1
Wang Z, Hu B, Zhang M, Li J, Li L, Gong M, Gao X. Diffusion Model-Based Visual Compensation Guidance and Visual Difference Analysis for No-Reference Image Quality Assessment. IEEE Transactions on Image Processing 2025; PP:263-278. [PMID: 40030878] [DOI: 10.1109/tip.2024.3523800]
Abstract
Existing free-energy-guided No-Reference Image Quality Assessment (NR-IQA) methods still struggle to effectively restore complexly distorted images. The features that guide the main network for quality assessment lack interpretability, and efficiently leveraging high-level feature information remains a significant challenge. As a novel class of state-of-the-art (SOTA) generative model, the diffusion model can capture intricate relationships, improving image restoration effectiveness. Moreover, the intermediate variables in the denoising iteration process carry clearer and more interpretable meanings for high-level visual information guidance. In view of this, we pioneer the application of the diffusion model to NR-IQA. We design a novel diffusion model for enhancing images with various types of distortion, producing higher-quality and more interpretable high-level visual information. Our experiments demonstrate that the diffusion model establishes a clear mapping between image reconstruction and image quality scores, which the network learns to guide quality assessment. Finally, to fully leverage high-level visual information, we design two complementary visual branches that collaboratively perform quality evaluation. Extensive experiments on seven public NR-IQA datasets demonstrate that the proposed model outperforms SOTA NR-IQA methods. The code will be available at https://github.com/handsomewzy/DiffV2IQA.
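To make the denoising iterates the abstract leans on concrete, here is a minimal sketch of a single DDPM reverse step; the noise predictor eps_model and the schedule tensors are illustrative assumptions, not the authors' network or restoration model.

```python
import torch

def ddpm_reverse_step(x_t, t, eps_model, alphas, alphas_bar):
    """One DDPM denoising step. eps_model is any callable predicting the noise
    in x_t; alphas and alphas_bar are 1-D schedule tensors. All are assumed
    inputs for this sketch, not the paper's actual components."""
    eps = eps_model(x_t, t)                          # predicted noise
    a_t, ab_t = alphas[t], alphas_bar[t]
    mean = (x_t - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
    if t == 0:
        return mean                                  # final denoised estimate
    sigma = torch.sqrt(1 - a_t)                      # one common variance choice
    # Each intermediate x_{t-1} is the kind of interpretable iterate the
    # abstract describes as guidance for the quality branches.
    return mean + sigma * torch.randn_like(x_t)
```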
2
Tian Y, Wen M, Lu D, Zhong X, Wu Z. Biological Basis and Computer Vision Applications of Image Phase Congruency: A Comprehensive Survey. Biomimetics (Basel) 2024; 9:422. [PMID: 39056863] [PMCID: PMC11274423] [DOI: 10.3390/biomimetics9070422]
Abstract
The concept of Image Phase Congruency (IPC) is deeply rooted in the way the human visual system interprets and processes spatial frequency information. It plays an important role in visual perception, influencing our capacity to identify objects, recognize textures, and decipher spatial relationships in our environments. IPC is robust to changes in lighting, contrast, and other variables that might modify the amplitude of light waves yet leave their relative phase unchanged. This characteristic is vital for perceptual tasks, as it ensures the consistent detection of features regardless of fluctuations in illumination or other environmental factors. It can also impact cognitive and emotional responses: cohesive phase information across elements fosters a perception of unity or harmony, while inconsistencies can engender a sense of discord or tension. In this survey, we begin by examining the evidence from biological vision studies suggesting that IPC is employed by the human perceptual system. We proceed to outline the typical mathematical representation and different computational approaches to IPC. We then summarize the extensive applications of IPC in computer vision, including denoising, image quality assessment, feature detection and description, image segmentation, image registration, image fusion, and object detection, among other uses, and illustrate its advantages with a number of examples. Finally, we discuss the current challenges associated with the practical applications of IPC and potential avenues for enhancement.
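As a concrete illustration of the mathematical representation referred to above, the following toy example computes Kovesi-style phase congruency for a 1-D signal; the band-pass filter bank and all its parameters are arbitrary assumptions made for the sketch.

```python
import numpy as np

def phase_congruency_1d(signal, n_bands=4, eps=1e-8):
    """Toy 1-D phase congruency: PC(x) = |sum_n c_n(x)| / (sum_n |c_n(x)| + eps),
    where c_n are complex band-pass responses kept analytic by using only
    positive frequencies. Filter shapes/centres are illustrative choices."""
    N = len(signal)
    F = np.fft.fft(signal)
    freqs = np.fft.fftfreq(N)
    responses = []
    for k in range(n_bands):
        f0 = 0.05 * 2 ** k                       # log-spaced centre frequencies
        band = np.exp(-((freqs - f0) ** 2) / (2 * (0.4 * f0) ** 2))
        band[freqs <= 0] = 0.0                   # positive frequencies only
        responses.append(np.fft.ifft(F * band))  # complex band response
    c = np.array(responses)
    return np.abs(c.sum(axis=0)) / (np.abs(c).sum(axis=0) + eps)

# A periodic step: phases align across bands at the discontinuities, so PC
# peaks near both edges (index 128 and the wrap-around at index 0).
x = np.r_[np.zeros(128), np.ones(128)]
print(np.argmax(phase_congruency_1d(x)))
```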
Affiliation(s)
- Yibin Tian
- College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China
- Ming Wen
- College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China
- Dajiang Lu
- College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China
- Xiaopin Zhong
- College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China
- Guangdong Digital Economy and Artificial Intelligence Lab., Shenzhen 518060, China
- Zongze Wu
- College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China
- Guangdong Digital Economy and Artificial Intelligence Lab., Shenzhen 518060, China
3
Zhang N, Lin C. The Image Definition Assessment of Optoelectronic Tracking Equipment Based on the BRISQUE Algorithm with Gaussian Weights. Sensors (Basel) 2023; 23:1621. [PMID: 36772660] [PMCID: PMC9921252] [DOI: 10.3390/s23031621]
Abstract
Defocus is an important factor that degrades the image quality of optoelectronic tracking equipment on the shooting range. In this paper, an improved blind/referenceless image spatial quality evaluator (BRISQUE) algorithm is formulated: image feature extraction yields a characteristic vector (CV) of 36 values that effectively reflect the defocus condition of the corresponding image. Images are evaluated and scored subjectively by human observers, and the subjective scores and CVs together constitute the training samples for the defocus evaluation model. An image database containing sufficiently many training samples is constructed, and a support vector machine (SVM) model is obtained by training with the SVM's regression function. In the experiments, the BRISQUE algorithm is used to obtain the image feature vectors. Establishing the image definition evaluation model via SVM is shown to be feasible and to yield high subjective-objective consistency.
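A minimal sketch of the feature-plus-regression pipeline described here, using MSCN coefficients (the core of BRISQUE-style features) and scikit-learn's SVR; the reduced feature set and all data are placeholders, not the paper's 36-value CV or its database.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.svm import SVR

def mscn(image, sigma=7/6, c=1e-3):
    """Mean-subtracted contrast-normalized (MSCN) coefficients; sigma and c
    are common but assumed values."""
    mu = gaussian_filter(image, sigma)
    var = gaussian_filter(image * image, sigma) - mu * mu
    return (image - mu) / (np.sqrt(np.clip(var, 0, None)) + c)

def brisque_like_features(image):
    # A reduced stand-in for the paper's 36-dim vector: moments of the MSCN
    # map and of its horizontal/vertical pairwise products.
    m = mscn(image.astype(np.float64))
    feats = [m.mean(), m.var(), np.abs(m).mean()]
    for prod in (m[:, :-1] * m[:, 1:], m[:-1, :] * m[1:, :]):
        feats += [prod.mean(), prod.var()]
    return np.array(feats)

# SVM regression from feature vectors to subjective scores, as in the paper.
rng = np.random.default_rng(0)
images = [rng.random((64, 64)) for _ in range(20)]   # placeholder images
scores = rng.random(20)                              # placeholder subjective scores
model = SVR(kernel="rbf").fit([brisque_like_features(i) for i in images], scores)
```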
Affiliation(s)
- Ning Zhang
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Cui Lin
- Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- University of Chinese Academy of Sciences, Beijing 100049, China
4
Zhang W, Liu Y, Dong C, Qiao Y. RankSRGAN: Super Resolution Generative Adversarial Networks With Learning to Rank. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:7149-7166. [PMID: 34310284] [DOI: 10.1109/tpami.2021.3096327]
Abstract
Generative Adversarial Networks (GANs) have demonstrated the potential to recover realistic details for single image super-resolution (SISR). To further improve the visual quality of super-resolved results, the PIRM2018-SR Challenge employed perceptual metrics such as PI, NIQE, and Ma to assess perceptual quality. However, existing methods cannot directly optimize these non-differentiable perceptual metrics, which are shown to be highly correlated with human ratings. To address this problem, we propose Super-Resolution Generative Adversarial Networks with a Ranker (RankSRGAN) to optimize the generator in the direction of different perceptual metrics. Specifically, we first train a Ranker that learns the behavior of perceptual metrics, and then introduce a novel rank-content loss to optimize perceptual quality. The most appealing part is that the proposed method can combine the strengths of different SR methods to generate better results. Furthermore, we extend our method to multiple Rankers to provide multi-dimensional constraints for the generator. Extensive experiments show that RankSRGAN achieves visually pleasing results and reaches state-of-the-art performance in perceptual metrics and quality. Project page: https://wenlongzhang0517.github.io/Projects/RankSRGAN.
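The Ranker idea can be illustrated with a pairwise margin ranking loss: a scoring network is pushed to rank the perceptually better image above the worse one. The tiny CNN and margin below are assumptions for the sketch, not RankSRGAN's actual architecture or rank-content loss.

```python
import torch
import torch.nn as nn

# Minimal Ranker sketch: score two images, enforce a ranking margin.
ranker = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
)
loss_fn = nn.MarginRankingLoss(margin=0.5)

better = torch.rand(4, 3, 64, 64)   # images ranked higher by a perceptual metric
worse = torch.rand(4, 3, 64, 64)
target = torch.ones(4, 1)           # +1: first input should score higher
loss = loss_fn(ranker(better), ranker(worse), target)
loss.backward()                     # trains the Ranker to mimic the metric's ordering
```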
5
Dong X, Fu L, Liu Q. No-reference image quality assessment for confocal endoscopy images with perceptual local descriptor. Journal of Biomedical Optics 2022; 27:056503. [PMID: 35585672] [PMCID: PMC9116465] [DOI: 10.1117/1.jbo.27.5.056503]
Abstract
Significance: Confocal endoscopy images often suffer distortions, resulting in quality degradation and information loss that increase the difficulty of diagnosis and can even lead to misdiagnosis. It is important to assess image quality and filter out images with low diagnostic value before diagnosis.
Aim: We propose a no-reference image quality assessment (IQA) method for confocal endoscopy images based on Weber's law and local descriptors. The proposed method detects the severity of image degradation by capturing the perceptual structure of an image.
Approach: We created a new dataset of 642 confocal endoscopy images to validate the performance of the proposed method, and conducted extensive experiments comparing its accuracy and speed with other state-of-the-art IQA methods.
Results: The proposed method achieved an SROCC of 0.85 and outperformed the other IQA methods.
Conclusions: Given its high consistency with subjective quality assessment, the proposed method can screen high-quality images in practical applications and aid diagnosis.
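As a rough illustration of a Weber's-law-based local descriptor of the kind the abstract mentions, the sketch below computes the differential excitation used in the Weber Local Descriptor family; treating this as the paper's exact descriptor would be an assumption.

```python
import numpy as np
from scipy.ndimage import convolve

def weber_differential_excitation(image, eps=1e-6):
    """Differential excitation: xi = arctan(sum(neighbors - center) / center),
    computed over a 3x3 neighborhood (the usual choice, assumed here)."""
    img = image.astype(np.float64)
    kernel = np.array([[1, 1, 1],
                       [1, -8, 1],
                       [1, 1, 1]])               # sum of (neighbor - center)
    diff = convolve(img, kernel, mode="reflect")
    return np.arctan(diff / (img + eps))

# Histogram of excitation values as a simple perceptual-structure feature.
img = np.random.default_rng(0).random((128, 128))
hist, _ = np.histogram(weber_differential_excitation(img), bins=16,
                       range=(-np.pi / 2, np.pi / 2), density=True)
```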
Affiliation(s)
- Xiangjiang Dong
- Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, China
- Ling Fu
- Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, China
- Hainan University, School of Biomedical Engineering, Key Laboratory of Biomedical Engineering of Hainan Province, Hainan, China
- Qian Liu
- Hainan University, School of Biomedical Engineering, Key Laboratory of Biomedical Engineering of Hainan Province, Hainan, China
6
Öztürk MM. A tuned feed-forward deep neural network algorithm for effort estimation. Journal of Experimental & Theoretical Artificial Intelligence 2022. [DOI: 10.1080/0952813x.2021.1871664]
Affiliation(s)
- Muhammed Maruf Öztürk
- Department of Computer Engineering, Faculty of Engineering, Suleyman Demirel University, Isparta, Turkey
7
8
Pan Z, Yuan F, Lei J, Fang Y, Shao X, Kwong S. VCRNet: Visual Compensation Restoration Network for No-Reference Image Quality Assessment. IEEE Transactions on Image Processing 2022; 31:1613-1627. [PMID: 35081029] [DOI: 10.1109/tip.2022.3144892]
Abstract
Guided by the free-energy principle, generative adversarial network (GAN)-based no-reference image quality assessment (NR-IQA) methods have improved image quality prediction accuracy. However, the GAN cannot handle the restoration task well for free-energy-principle-guided NR-IQA methods, especially for severely destroyed images, so the quality reconstruction relationship between a distorted image and its restored image cannot be accurately established. To address this problem, a visual compensation restoration network (VCRNet)-based NR-IQA method is proposed, which uses a non-adversarial model to efficiently handle the distorted-image restoration task. The proposed VCRNet consists of a visual restoration network and a quality estimation network. To accurately build the quality reconstruction relationship between a distorted image and its restored image, a visual compensation module, an optimized asymmetric residual block, and an error-map-based mixed loss function are proposed to increase the restoration capability of the visual restoration network. To further address the NR-IQA problem for severely destroyed images, the multi-level restoration features obtained from the visual restoration network are used for image quality estimation. To prove the effectiveness of the proposed VCRNet, seven representative IQA databases are used, and experimental results show that the proposed VCRNet achieves state-of-the-art image quality prediction accuracy. The implementation has been released at https://github.com/NUIST-Videocoding/VCRNet.
9
Sandic-Stankovic DD, Kukolj DD, Le Callet P. Quality Assessment of DIBR-Synthesized Views Based on Sparsity of Difference of Closings and Difference of Gaussians. IEEE Transactions on Image Processing 2022; 31:1161-1175. [PMID: 34990360] [DOI: 10.1109/tip.2021.3139238]
Abstract
Images synthesized using depth-image-based rendering (DIBR) techniques may suffer from complex structural distortions. The goal of the primary visual cortex and other parts of the brain is to reduce redundancies in the input visual signal in order to discover the intrinsic image structure, and thus create a sparse image representation. The human visual system (HVS) treats images at several scales and several levels of resolution when perceiving a visual scene. In an attempt to emulate these properties of the HVS, we have designed a no-reference model for the quality assessment of DIBR-synthesized views. To extract higher-order structure of high curvature, which corresponds to the shape distortions to which the HVS is highly sensitive, we define a morphological oriented Difference of Closings (DoC) operator and use it at multiple scales and resolutions. The DoC operator nonlinearly removes redundancies and extracts fine-grained details, the texture of local image structure, and contrast, to which the HVS is highly sensitive. We introduce a new feature based on the sparsity of the DoC band. To extract perceptually important low-order structural information (edges), we use the non-oriented Difference of Gaussians (DoG) operator at different scales and resolutions, and compute a sparsity measure for the DoG bands to obtain scalar features. To model the relationship between the extracted features and subjective scores, a general regression neural network (GRNN) is used. Quality predictions by the proposed DoC-DoG-GRNN model show higher agreement with perceptual quality scores than the tested state-of-the-art metrics on four benchmark datasets with synthesized views: the IRCCyN/IVC image/video datasets, the MCL-3D stereoscopic image dataset, and the IST image dataset.
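A small sketch of the DoC/DoG sparsity features described above; the structuring-element sizes, Gaussian scales, and the Hoyer-style sparsity measure are illustrative choices, not necessarily the paper's exact definitions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, grey_closing

def difference_of_closings(image, size_small=3, size_large=7):
    """Morphological Difference of Closings: closings at two scales, subtracted.
    Structuring-element sizes are assumed values."""
    return (grey_closing(image, size=size_large)
            - grey_closing(image, size=size_small))

def difference_of_gaussians(image, sigma_small=1.0, sigma_large=2.0):
    return gaussian_filter(image, sigma_small) - gaussian_filter(image, sigma_large)

def sparsity(band, eps=1e-8):
    """Hoyer l1/l2 sparsity in [0, 1]: higher when energy concentrates in few
    coefficients (one plausible measure, assumed here)."""
    v = band.ravel()
    n = v.size
    l1, l2 = np.abs(v).sum(), np.sqrt((v * v).sum()) + eps
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

img = np.random.default_rng(0).random((128, 128))
features = [sparsity(difference_of_closings(img)),      # DoC-band feature
            sparsity(difference_of_gaussians(img))]     # DoG-band feature
```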
10
Abouelaziz I, Chetouani A, El Hassouni M, Latecki LJ, Cherifi H. 3D visual saliency and convolutional neural network for blind mesh quality assessment. Neural Computing and Applications 2020. [DOI: 10.1007/s00521-019-04521-1]
11
Fan YY, Sang YJ. A No-Reference Image Quality Comprehensive Assessment Method. International Journal of Pattern Recognition and Artificial Intelligence 2020. [DOI: 10.1142/s0218001421540112]
Abstract
Building on the current state of research in comprehensive image quality assessment, a no-reference comprehensive image quality assessment function model is proposed in this paper. First, image quality is decomposed into contrast, sharpness, and signal-to-noise ratio (SNR), and the interrelation of these assessment indices is analyzed. Second, the weights in the comprehensive assessment model are studied when only contrast, sharpness, or SNR is varied. Finally, building on the study of each distortion separately and considering different types of image distortion, we determine the weights of each index in the comprehensive assessment. The results show that the proposed model fits human visual perception well and correlates strongly with the Difference Mean Opinion Score (DMOS): the Correlation Coefficient (CC) reached 0.8331, the Spearman Rank Order Correlation Coefficient (SROCC) reached 0.8206, the Mean Absolute Error (MAE) was only 0.0920, the Root Mean Square Error (RMSE) was only 0.1122, and the Outlier Ratio (OR) was only 0.0365. The proposed method can be applied to the television systems of photoelectric measurement equipment to give accurate and reliable quality assessments of no-reference television images.
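A minimal sketch of the weighted three-index fusion and of the agreement statistics the abstract quotes (CC, SROCC, MAE, RMSE); the weights and data are placeholders, since the paper derives its weights empirically per distortion type.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def comprehensive_quality(contrast, sharpness, snr, weights=(0.3, 0.4, 0.3)):
    """Weighted fusion of three per-image indices into one score; the weights
    here are placeholder assumptions."""
    w = np.asarray(weights)
    return w[0] * contrast + w[1] * sharpness + w[2] * snr

# Agreement with subjective DMOS, reported with the abstract's statistics.
rng = np.random.default_rng(0)
dmos = rng.random(50)                                   # placeholder DMOS
pred = comprehensive_quality(rng.random(50), rng.random(50), rng.random(50))
cc, _ = pearsonr(pred, dmos)
srocc, _ = spearmanr(pred, dmos)
mae = np.abs(pred - dmos).mean()
rmse = np.sqrt(((pred - dmos) ** 2).mean())
```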
Affiliation(s)
- Yuan-Yuan Fan
- Faculty of Mathematics and Physics, Huaiyin Institute of Technology, Huaian 223003, P. R. China
- Ying-Jun Sang
- Faculty of Automation, Huaiyin Institute of Technology, Huaian 223003, P. R. China
12
Ko H, Lee DY, Cho S, Bovik AC. Quality Prediction on Deep Generative Images. IEEE Transactions on Image Processing 2020; 29:5964-5979. [PMID: 32310772] [DOI: 10.1109/tip.2020.2987180]
Abstract
In recent years, deep neural networks have been utilized in a wide variety of applications, including image generation. In particular, generative adversarial networks (GANs) are able to produce highly realistic pictures as part of tasks such as image compression. As with standard compression, it is desirable to be able to automatically assess the perceptual quality of generative images to monitor and control the encoding process. However, existing image quality algorithms are ineffective on GAN-generated content, especially on textured regions and at high compressions. Here we propose a new "naturalness"-based image quality predictor for generative images. Our new GAN picture quality predictor is built using a multi-stage parallel boosting system based on structural similarity features and measurements of statistical similarity. To enable model development and testing, we also constructed a subjective GAN image quality database containing (distorted) GAN images and collected human opinions of them. Our experimental results indicate that the proposed GAN IQA model delivers superior quality predictions on generative image datasets, as well as on traditional image quality datasets.
13
Cui Y. No-Reference Image Quality Assessment Based on Dual-Domain Feature Fusion. Entropy 2020; 22:344. [PMID: 33286117] [PMCID: PMC7516814] [DOI: 10.3390/e22030344]
Abstract
Image quality assessment (IQA) aims to devise computational models that evaluate image quality in a perceptually consistent manner. In this paper, a novel no-reference image quality assessment model based on dual-domain feature fusion, dubbed DFF-IQA, is proposed. First, in the spatial domain, features based on the weighted local binary pattern, naturalness, and spatial entropy are extracted, where the naturalness features are represented by the fitted parameters of a generalized Gaussian distribution. Second, in the frequency domain, features based on spectral entropy, oriented energy distribution, and the fitted parameters of an asymmetric generalized Gaussian distribution are extracted. Third, the features from the two domains are fused to form the quality-aware feature vector. Finally, random forest regression maps the image features to a quality score. The resulting algorithm is tested on the LIVE database and compared with competing IQA models; the experimental results indicate that the proposed DFF-IQA method is more consistent with the human visual system than the competing methods.
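Two of the dual-domain cues (spatial and spectral entropy) and the random forest regression stage can be sketched as follows; the feature set is deliberately reduced (the full model also uses weighted LBP and (A)GGD fit parameters) and all data are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def spatial_entropy(image, bins=64):
    # Shannon entropy of the intensity histogram (spatial-domain cue).
    p, _ = np.histogram(image, bins=bins, density=True)
    p = p[p > 0] / p[p > 0].sum()
    return -(p * np.log2(p)).sum()

def spectral_entropy(image, bins=64):
    # Shannon entropy of the normalized power-spectrum values (frequency cue).
    psd = np.abs(np.fft.fft2(image)) ** 2
    p, _ = np.histogram(psd / psd.sum(), bins=bins)
    p = p[p > 0] / p.sum()
    return -(p * np.log2(p)).sum()

def dual_domain_features(image):
    return np.array([spatial_entropy(image), spectral_entropy(image)])

rng = np.random.default_rng(0)
X = np.array([dual_domain_features(rng.random((64, 64))) for _ in range(30)])
y = rng.random(30)                                   # placeholder quality scores
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
```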
Affiliation(s)
- Yueli Cui
- School of Electronic and Information Engineering, Taizhou University, Taizhou 318017, China
14
Barricelli BR, Casiraghi E, Lecca M, Plutino A, Rizzi A. A cockpit of multiple measures for assessing film restoration quality. Pattern Recognition Letters 2020. [DOI: 10.1016/j.patrec.2020.01.009]
15
Zhang P, Meng J, Luan Y, Liu C. Plant miRNA-lncRNA Interaction Prediction with the Ensemble of CNN and IndRNN. Interdisciplinary Sciences: Computational Life Sciences 2019; 12:82-89. [PMID: 31811618] [DOI: 10.1007/s12539-019-00351-w]
Abstract
Non-coding RNA (ncRNA) plays an important role in regulating the biological activities of animals and plants; representative examples are microRNA (miRNA) and long non-coding RNA (lncRNA). Recent research has found that predicting the interaction between miRNA and lncRNA is the primary task in elucidating their functional mechanisms. Because of the small scale of the data, large amounts of noise, and the limitations of human factors, the prediction accuracy and reliability of traditional feature-based classification methods are often affected; moreover, plant ncRNA structure is complex. This paper proposes an ensemble deep-learning model based on a convolutional neural network (CNN) and an independently recurrent neural network (IndRNN) for predicting the interaction between plant miRNA and lncRNA, named CIRNN. The model uses the CNN to explore the functional features of gene sequences automatically, leverages the IndRNN to obtain the representation of sequence features, and learns the dependencies among sequences; it thus overcomes the inaccuracy introduced by human factors in traditional feature engineering. Experiment results show that the proposed model is superior to shallow machine-learning and existing deep-learning models when dealing with large-scale data, especially for long sequences.
Affiliation(s)
- Peng Zhang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Jun Meng
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Yushi Luan
- School of Bioengineering, Dalian University of Technology, Dalian 116024, Liaoning, China
- Chanjuan Liu
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
16
Wu J, Zhang M, Li L, Dong W, Shi G, Lin W. No-reference image quality assessment with visual pattern degradation. Information Sciences 2019. [DOI: 10.1016/j.ins.2019.07.061]
17
Yan Q, Gong D, Zhang Y. Two-Stream Convolutional Networks for Blind Image Quality Assessment. IEEE Transactions on Image Processing 2019; 28:2200-2211. [PMID: 30507506] [DOI: 10.1109/tip.2018.2883741]
Abstract
Traditional image quality assessment (IQA) methods do not perform robustly due to their shallow hand-designed features. It has been demonstrated that deep neural networks can learn more effective features. In this paper, we describe a new deep neural network that accurately predicts image quality without relying on a reference image. To learn more effective feature representations for no-reference IQA, we propose a two-stream convolutional network with subcomponents for the image and the gradient image. The motivation for this design is to capture different levels of information from the inputs and to ease the difficulty of extracting features from a single stream: the gradient stream focuses on extracting fine structural features, while the image stream pays more attention to intensity information. In addition, to account for the locally non-uniform distribution of distortion in images, we add a region-based fully convolutional layer that uses the information around the center of the input image patch. The final score for the overall image is calculated by averaging the patch scores. The proposed network operates end-to-end in both the training and testing phases. Experimental results on a series of benchmark datasets, e.g., LIVE, CSIQ, IVC, TID2013, and the Waterloo Exploration Database, show that the proposed algorithm outperforms state-of-the-art methods, verifying the effectiveness of our architecture.
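A compact sketch of the two-stream design: an image branch and a gradient branch whose features are concatenated and regressed to a patch score, with patch averaging at the end as the abstract describes. Layer sizes and all other details are illustrative assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class TwoStreamIQA(nn.Module):
    """Two streams: one sees the patch, the other its gradient magnitude."""
    def __init__(self):
        super().__init__()
        def stream():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.image_stream, self.grad_stream = stream(), stream()
        self.head = nn.Linear(64, 1)              # concatenated features -> score

    def forward(self, patch):
        gx = patch[..., :, 1:] - patch[..., :, :-1]   # horizontal differences
        gy = patch[..., 1:, :] - patch[..., :-1, :]   # vertical differences
        grad = torch.sqrt(gx[..., :-1, :] ** 2 + gy[..., :, :-1] ** 2)
        return self.head(torch.cat(
            [self.image_stream(patch), self.grad_stream(grad)], dim=1))

scores = TwoStreamIQA()(torch.rand(8, 1, 32, 32))  # one score per patch
overall = scores.mean()                            # image score = patch average
```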
18
Kim J, Nguyen AD, Lee S. Deep CNN-Based Blind Image Quality Predictor. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:11-24. [PMID: 29994270] [DOI: 10.1109/tnnls.2018.2829819]
Abstract
Image recognition based on convolutional neural networks (CNNs) has recently been shown to deliver state-of-the-art performance in various areas of computer vision and image processing. Nevertheless, applying a deep CNN to no-reference image quality assessment (NR-IQA) remains challenging due to a critical obstacle: the lack of a large training database. In this paper, we propose a CNN-based NR-IQA framework that can effectively solve this problem. The proposed method, the Deep Image Quality Assessor (DIQA), separates NR-IQA training into two stages: 1) an objective distortion part and 2) a human visual system-related part. In the first stage, the CNN learns to predict the objective error map; in the second stage, the model learns to predict the subjective score. To complement the inaccuracy of objective error map prediction in homogeneous regions, we also propose a reliability map, and two simple handcrafted features are additionally employed to further enhance accuracy. In addition, we propose a way to visualize perceptual error maps to analyze what the deep CNN model has learned. In the experiments, DIQA yielded state-of-the-art accuracy on various databases.
19
GROF: Indoor Localization Using a Multiple-Bandwidth General Regression Neural Network and Outlier Filter. Sensors (Basel) 2018; 18:3723. [PMID: 30388845] [PMCID: PMC6263617] [DOI: 10.3390/s18113723]
Abstract
In recent years, a variety of methods have been developed for indoor localization utilizing fingerprints of received signal strength (RSS), which are location dependent. Nevertheless, RSS is sensitive to environmental variations, and the resulting fluctuations severely degrade localization accuracy. Furthermore, the fingerprint survey process is time-consuming and labor-intensive, so lightweight fingerprint-based indoor positioning approaches are preferred for practical applications. In this paper, a novel multiple-bandwidth generalized regression neural network (GRNN) with an outlier filter indoor positioning approach (GROF) is proposed. The GROF method is based on the GRNN, for which we adopt a new multiple-bandwidth kernel architecture to achieve more flexible regression performance than the traditional GRNN. In addition, an outlier filtering scheme based on the k-nearest neighbor (KNN) method is embedded into the localization module to improve robustness against environmental changes. We describe the multiple-bandwidth spread-value training process and the outlier filtering algorithm, and demonstrate the feasibility and performance of GROF through experimental data collected on a Universal Software Radio Peripheral (USRP) platform. The experimental results indicate that GROF outperforms positioning methods based on the standard GRNN, KNN, or a backpropagation neural network (BPNN), in both localization accuracy and robustness, without requiring extra training samples.
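The GRNN at the heart of GROF is a Nadaraya-Watson kernel regressor; the sketch below uses one bandwidth per input dimension to echo the multiple-bandwidth kernel idea. The bandwidth values and the synthetic fingerprint data are assumptions, not the paper's setup.

```python
import numpy as np

def grnn_predict(X_train, y_train, x, bandwidths):
    """GRNN (Nadaraya-Watson) prediction with one Gaussian bandwidth per
    input dimension."""
    d = (X_train - x) / bandwidths              # per-dimension scaling
    w = np.exp(-0.5 * (d * d).sum(axis=1))      # Gaussian kernel weights
    return (w @ y_train) / (w.sum() + 1e-12)    # weighted average of targets

rng = np.random.default_rng(0)
X = rng.random((100, 4))                        # e.g., RSS fingerprints from 4 APs
y = X @ np.array([1.0, -2.0, 0.5, 3.0])         # synthetic target coordinate
bw = np.array([0.2, 0.2, 0.1, 0.3])             # one spread value per dimension
print(grnn_predict(X, y, X[0], bw))             # close to y[0] for small spreads
```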
20
21
Zhang Y, Chandler DM. Opinion-Unaware Blind Quality Assessment of Multiply and Singly Distorted Images via Distortion Parameter Estimation. IEEE Transactions on Image Processing 2018; 27:5433-5448. [PMID: 30028705] [DOI: 10.1109/tip.2018.2857413]
Abstract
Over the past couple of decades, numerous image quality assessment (IQA) algorithms have been developed to estimate the quality of images containing a single type of distortion. In practice, however, images can be contaminated by multiple distortions, and previous research on the quality assessment of multiply-distorted images is very limited. In this paper, we propose an efficient algorithm to blindly assess the quality of both multiply and singly distorted images by predicting the distortion parameters from a bag of natural scene statistics (NSS) features. Our method, the MUltiply- and Singly-distorted Image QUality Estimator (MUSIQUE), operates in three main stages. In the first stage, a two-layer classification model identifies the distortion types (i.e., Gaussian blur, JPEG compression, and white noise) that may exist in an image. In the second stage, distortion-specific regression models predict the three distortion parameters (i.e., σG for Gaussian blur, Q for JPEG compression, and σN for white noise) by learning different NSS features for different distortion types and combinations. In the final stage, the three estimated parameter values are mapped and combined into an overall quality estimate based on quality-mapping curves and the most-apparent-distortion strategy. Experimental results on three multiply-distorted and seven singly-distorted image quality databases demonstrate that MUSIQUE achieves better or competitive performance compared with other state-of-the-art FR/NR IQA algorithms.
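The three-stage structure can be sketched with off-the-shelf classifiers and regressors; the random features, random-forest models, and final mapping below are placeholders standing in for the paper's NSS features, two-layer classifier, and quality-mapping curves.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 8))                         # stand-in NSS feature vectors
dist_labels = rng.integers(0, 3, 200)            # 0: blur, 1: JPEG, 2: noise
params = rng.random(200)                         # stand-in sigma_G / Q / sigma_N

clf = RandomForestClassifier(random_state=0).fit(X, dist_labels)
regressors = {k: RandomForestRegressor(random_state=0)
                 .fit(X[dist_labels == k], params[dist_labels == k])
              for k in range(3)}

def predict_quality(x):
    k = int(clf.predict(x[None])[0])             # stage 1: distortion type
    p = float(regressors[k].predict(x[None])[0]) # stage 2: parameter estimate
    return 1.0 / (1.0 + p)                       # stage 3: placeholder mapping

print(predict_quality(X[0]))
```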
22
Gu K, Tao D, Qiao JF, Lin W. Learning a No-Reference Quality Assessment Model of Enhanced Images With Big Data. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:1301-1313. [PMID: 28287984] [DOI: 10.1109/tnnls.2017.2649101]
Abstract
In this paper, we investigate the problem of image quality assessment (IQA) and enhancement via machine learning. This issue has long attracted wide attention in the computational intelligence and image processing communities, since, for many practical applications such as object detection and recognition, raw images usually need to be appropriately enhanced to raise visual quality (e.g., visibility and contrast). In fact, proper enhancement can noticeably improve the quality of input images, even beyond that of the originally captured images, which are generally assumed to be of the best quality. This paper makes two main contributions. The first is a new no-reference (NR) IQA model. Given an image, our quality measure first extracts 17 features through analysis of contrast, sharpness, brightness, and more, and then yields a measure of visual quality using a regression module learned with big-data training samples much larger than the relevant image datasets. Experiments on nine datasets validate the superiority and efficiency of our blind metric compared with typical state-of-the-art full-reference, reduced-reference, and NR IQA methods. The second contribution is a robust image enhancement framework based on quality optimization. For an input image, guided by the proposed NR-IQA measure, we conduct histogram modification to successively rectify image brightness and contrast to a proper level. Thorough tests demonstrate that our framework can effectively enhance natural images, low-contrast images, low-light images, and dehazed images. The source code will be released at https://sites.google.com/site/guke198701/publications.
23
Deep Activation Pooling for Blind Image Quality Assessment. Applied Sciences (Basel) 2018. [DOI: 10.3390/app8040478]
24
Xie X, Zhang Y, Wu J, Shi G, Dong W. Bag-of-words feature representation for blind image quality assessment with local quantized pattern. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.05.034]
25
Zhou W, Yu L, Qiu W, Zhou Y, Wu M. Local gradient patterns (LGP): An effective local-statistical-feature extraction scheme for no-reference image quality assessment. Information Sciences 2017. [DOI: 10.1016/j.ins.2017.02.049]
26
Ma K, Liu W, Liu T, Wang Z, Tao D. dipIQ: Blind Image Quality Assessment by Learning-to-Rank Discriminable Image Pairs. IEEE Transactions on Image Processing 2017; 26:3951-3964. [PMID: 28574353] [DOI: 10.1109/tip.2017.2708503]
Abstract
Objective assessment of image quality is fundamentally important in many image processing tasks. In this paper, we focus on learning blind image quality assessment (BIQA) models, which predict the quality of a digital image with no access to its original pristine-quality counterpart as reference. One of the biggest challenges in learning BIQA models is the conflict between the gigantic image space (whose dimension equals the number of image pixels) and the extremely limited reliable ground-truth data for training. Such data are typically collected via subjective testing, which is cumbersome, slow, and expensive. Here, we first show that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost by exploiting large-scale databases with diverse image content. We then learn an opinion-unaware BIQA (OU-BIQA, meaning that no subjective opinions are used for training) model using RankNet, a pairwise learning-to-rank (L2R) algorithm, from millions of DIPs, each associated with a perceptual uncertainty level, leading to a DIP-inferred quality (dipIQ) index. Extensive experiments on four benchmark IQA databases demonstrate that dipIQ outperforms state-of-the-art OU-BIQA models, and its robustness is significantly improved, as confirmed by the group MAximum Differentiation competition method. Furthermore, we extend the proposed framework by learning models with ListNet (a listwise L2R algorithm) on quality-discriminable image lists (DILs); the resulting DIL-inferred quality index achieves an additional performance gain.
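The pairwise RankNet training at the core of this approach reduces to a logistic loss on score differences: for a discriminable pair, the model is trained so that sigmoid(s_better - s_worse) approaches 1. The scoring MLP and feature dimensionality below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# RankNet-style pairwise learning-to-rank on quality-discriminable pairs.
scorer = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

x_better = torch.rand(64, 8)     # features of higher-quality images
x_worse = torch.rand(64, 8)      # features of their degraded counterparts
for _ in range(100):
    s_diff = scorer(x_better) - scorer(x_worse)
    # BCE on sigmoid(s_diff) with target 1 is exactly the RankNet loss for
    # a pair whose preference is certain.
    loss = nn.functional.binary_cross_entropy_with_logits(
        s_diff, torch.ones_like(s_diff))
    opt.zero_grad()
    loss.backward()
    opt.step()
```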
27
Kundu D, Ghadiyaram D, Bovik AC, Evans BL. No-Reference Quality Assessment of Tone-Mapped HDR Pictures. IEEE Transactions on Image Processing 2017; 26:2957-2971. [PMID: 28333633] [DOI: 10.1109/tip.2017.2685941]
Abstract
Being able to automatically predict digital picture quality, as perceived by human observers, has become important in many applications where humans are the ultimate consumers of displayed visual information. Standard dynamic range (SDR) images provide 8 bits/color/pixel. High dynamic range (HDR) images, which are usually created from multiple exposures of the same scene, can provide 16 or 32 bits/color/pixel, but must be tone-mapped to SDR for display on standard monitors. Multi-exposure fusion techniques bypass HDR creation by fusing the exposure stack directly to SDR format while aiming for aesthetically pleasing luminance and color distributions. Here, we describe a new no-reference image quality assessment (NR IQA) model for HDR pictures that is based on standard measurements of the bandpass and on newly conceived differential natural scene statistics (NSS) of HDR pictures. From the model we derive an algorithm, the HDR Image GRADient-based Evaluator (HIGRADE). NSS models have previously been used to devise NR IQA models that effectively predict the subjective quality of SDR images, but they perform significantly worse on tone-mapped HDR content. Toward ameliorating this, we make the following contributions: 1) we design HDR picture NR IQA models and algorithms using both standard space-domain NSS features and novel HDR-specific gradient-based features that significantly elevate prediction performance; 2) we validate the proposed models on a large-scale crowdsourced HDR image database; and 3) we demonstrate that the proposed models also perform well on legacy natural SDR images. The software is available at: http://live.ece.utexas.edu/research/Quality/higradeRelease.zip.
28
Yuan Y, Yi Y, Liu J. Integrated visual quality assessment for ZiYuan-3 optical satellite panchromatic products. The Imaging Science Journal 2017. [DOI: 10.1080/13682199.2017.1313562]
29
30
Zhou Y, Kwong S, Gao W, Wang X. A phase congruency based patch evaluator for complexity reduction in multi-dictionary based single-image super-resolution. Information Sciences 2016. [DOI: 10.1016/j.ins.2016.05.024]
31
32
Yang X, Sun Q, Wang T. Blind image quality assessment via probabilistic latent semantic analysis. SpringerPlus 2016; 5:1714. [PMID: 27777850] [PMCID: PMC5050185] [DOI: 10.1186/s40064-016-3400-1]
Abstract
We propose a blind image quality assessment method that is unsupervised and training-free. The new method is based on the hypothesis that the effect caused by distortion can be expressed through certain latent characteristics. Combined with probabilistic latent semantic analysis, these latent characteristics can be discovered by applying a topic model over a visual word dictionary. Four distortion-affected features are extracted to form the visual words in the dictionary: (1) the block-based local histogram; (2) the block-based local mean value; (3) the mean contrast within a block; and (4) the variance of contrast within a block. Based on the dictionary, the latent topics in the images can be discovered, and the discrepancy between the topic frequencies in an unfamiliar image and in a large set of pristine images is used to measure image quality. Experimental results on four open databases show that the proposed method correlates well with human subjective judgments of diversely distorted images.
Affiliation(s)
- Xichen Yang
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Xiaolingwei 200, Nanjing, China
- Quansen Sun
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Xiaolingwei 200, Nanjing, China
- Tianshu Wang
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Xiaolingwei 200, Nanjing, China
33
Zhang G, Xia JJ, Liebschner M, Zhang X, Kim D, Zhou X. Improved Rubin-Bodner model for the prediction of soft tissue deformations. Medical Engineering & Physics 2016; 38:1369-1375. [PMID: 27717593] [DOI: 10.1016/j.medengphy.2016.09.008]
Abstract
In craniomaxillofacial (CMF) surgery, a reliable way of simulating the soft tissue deformation resulting from skeletal reconstruction is vitally important for preventing the risk of postoperative facial distortion, but it is difficult to simulate soft tissue behavior under different types of CMF surgery. This study presents an integrated biomechanical and statistical learning model to improve the accuracy and reliability of predictions of soft facial tissue behavior. The Rubin-Bodner (RB) model is first used to describe the biomechanical behavior of the soft facial tissue. A finite element model (FEM) then computes the stress at each node of the soft-facial-tissue mesh resulting from bone displacement. Next, the generalized regression neural network (GRNN) method is used to learn the relationship between facial soft tissue deformation and the stress distribution corresponding to different CMF surgery types, and to improve the estimation of the elastic parameters of the RB model. The soft facial tissue deformation can therefore be predicted by combining biomechanical properties with the statistical model. Leave-one-out cross-validation on eleven patients shows that the average prediction error of our model (0.7035 mm) is lower than those of other approaches, and demonstrates that the more accurate the biomechanical information available to the model, the better its prediction performance.
Affiliation(s)
- Guangming Zhang
- Department of Radiology, Wake Forest University School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA
- James J Xia
- The Methodist Hospital Research Institute, Weill Cornell Medical College, Houston, TX 77030, USA
- Michael Liebschner
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX 77030, USA
- Xiaoyan Zhang
- The Methodist Hospital Research Institute, Weill Cornell Medical College, Houston, TX 77030, USA
- Daeseung Kim
- The Methodist Hospital Research Institute, Weill Cornell Medical College, Houston, TX 77030, USA
- Xiaobo Zhou
- Department of Radiology, Wake Forest University School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA
34
Li X, Xu Q, Li B, Song X. A Highly Reliable and Cost-Efficient Multi-Sensor System for Land Vehicle Positioning. Sensors (Basel) 2016; 16:755. [PMID: 27231917] [PMCID: PMC4934181] [DOI: 10.3390/s16060755]
Abstract
In this paper, we propose a novel positioning solution for land vehicles which is highly reliable and cost-efficient. The proposed positioning system fuses information from the MEMS-based reduced inertial sensor system (RISS) which consists of one vertical gyroscope and two horizontal accelerometers, low-cost GPS, and supplementary sensors and sources. First, pitch and roll angle are accurately estimated based on a vehicle kinematic model. Meanwhile, the negative effect of the uncertain nonlinear drift of MEMS inertial sensors is eliminated by an H∞ filter. Further, a distributed-dual-H∞ filtering (DDHF) mechanism is adopted to address the uncertain nonlinear drift of the MEMS-RISS and make full use of the supplementary sensors and sources. The DDHF is composed of a main H∞ filter (MHF) and an auxiliary H∞ filter (AHF). Finally, a generalized regression neural network (GRNN) module with good approximation capability is specially designed for the MEMS-RISS. A hybrid methodology which combines the GRNN module and the AHF is utilized to compensate for RISS position errors during GPS outages. To verify the effectiveness of the proposed solution, road-test experiments with various scenarios were performed. The experimental results illustrate that the proposed system can achieve accurate and reliable positioning for land vehicles.
Affiliation(s)
- Xu Li
- School of Instrument Science and Engineering, Southeast University, Nanjing 210096, China
- Qimin Xu
- School of Instrument Science and Engineering, Southeast University, Nanjing 210096, China
- Bin Li
- Key Laboratory of Technology on Intelligent Transportation Systems, Ministry of Transport, Research Institute of Highway, Ministry of Transport, Beijing 100088, China
- Xianghui Song
- Key Laboratory of Technology on Intelligent Transportation Systems, Ministry of Transport, Research Institute of Highway, Ministry of Transport, Beijing 100088, China
35
Shao F, Tian W, Lin W, Jiang G, Dai Q. Toward a Blind Deep Quality Evaluator for Stereoscopic Images Based on Monocular and Binocular Interactions. IEEE Transactions on Image Processing 2016; 25:2059-2074. [PMID: 26960225] [DOI: 10.1109/tip.2016.2538462]
Abstract
During recent years, blind image quality assessment (BIQA) has been intensively studied with different machine learning tools. Existing BIQA metrics, however, are not designed for stereoscopic images. We believe this problem can be resolved by separating 3D images and capturing the essential attributes of images via a deep neural network. In this paper, we propose a blind deep quality evaluator (DQE) for stereoscopic images (denoted 3D-DQE) based on monocular and binocular interactions. The key technical steps of the proposed 3D-DQE are to train two separate 2D deep neural networks (2D-DNNs) on 2D monocular images and cyclopean images to model the processes of monocular and binocular quality prediction, and to combine the measured 2D monocular and cyclopean quality scores using different weighting schemes. Experimental results on four public 3D image quality assessment databases demonstrate that, in comparison with existing methods, the devised algorithm achieves highly consistent alignment with subjective assessment.
36
One pass learning for generalized classifier neural network. Neural Networks 2016; 73:70-76. [DOI: 10.1016/j.neunet.2015.10.008]
37
Gu K, Zhai G, Lin W, Liu M. The Analysis of Image Contrast: From Quality Assessment to Automatic Enhancement. IEEE Transactions on Cybernetics 2016; 46:284-297. [PMID: 25775503] [DOI: 10.1109/tcyb.2015.2401732]
Abstract
Proper contrast change can improve the perceptual quality of most images, but it has largely been overlooked in current image quality assessment (IQA) research. To fill this void, in this paper we first report a new large dedicated contrast-changed image database (CCID2014), which includes 655 images and associated subjective ratings recorded from 22 inexperienced observers. We then present a novel reduced-reference image quality metric for contrast change (RIQMC) using phase congruency and statistics of the image histogram. Validation of the proposed model on the contrast-related CCID2014, TID2008, CSIQ, and TID2013 databases justifies the superiority and efficiency of RIQMC over the majority of classical and state-of-the-art IQA methods. Furthermore, we combine the aforesaid subjective and objective assessments to derive the RIQMC-based Optimal HIstogram Mapping (ROHIM) for automatic contrast enhancement, which is shown to outperform recently developed enhancement technologies.
38
Zhang C, Pan J, Chen S, Wang T, Sun D. No reference image quality assessment using sparse feature representation in two dimensions spatial correlation. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.01.105]
39
Sharma M, Chaudhury S, Lall B. Sparse representation based classifier to assess video quality. 2015 Fifth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG) 2015. [DOI: 10.1109/ncvpripg.2015.7490045]
40
Gao F, Tao D, Gao X, Li X. Learning to rank for blind image quality assessment. IEEE Transactions on Neural Networks and Learning Systems 2015; 26:2275-2290. [PMID: 25616080] [DOI: 10.1109/tnnls.2014.2377181]
Abstract
Blind image quality assessment (BIQA) aims to predict perceptual image quality scores without access to reference images. State-of-the-art BIQA methods typically require subjects to score a large number of images to train a robust model. However, subjective quality scores are imprecise, biased, and inconsistent, and it is challenging to obtain a large-scale database, or to extend existing databases, because of the inconvenience of collecting images, training the subjects, conducting subjective experiments, and realigning human quality evaluations. To combat these limitations, this paper explores and exploits preference image pairs (PIPs), such as "the quality of image Ia is better than that of image Ib", for training a robust BIQA model. The preference label, representing the relative quality of two images, is generally precise and consistent, and is not sensitive to image content, distortion type, or subject identity; such PIPs can be generated at very low cost. The proposed BIQA method is one of learning to rank. We first formulate the problem of learning the mapping from image features to preference labels as one of classification. In particular, we investigate a multiple kernel learning algorithm based on group lasso to provide a solution. A simple but effective strategy to estimate perceptual image quality scores is then presented. Experiments show that the proposed BIQA method is highly effective and achieves performance comparable with state-of-the-art BIQA algorithms. Moreover, the proposed method can be easily extended to new distortion categories.
41
Alilou VK, Yaghmaee F. Application of GRNN neural network in non-texture image inpainting and restoration. Pattern Recognition Letters 2015. [DOI: 10.1016/j.patrec.2015.04.020]
42
Zhang L, Zhang L, Bovik AC. A feature-enriched completely blind image quality evaluator. IEEE Transactions on Image Processing 2015; 24:2579-2591. [PMID: 25915960] [DOI: 10.1109/tip.2015.2426416]
Abstract
Existing blind image quality assessment (BIQA) methods are mostly opinion-aware: they learn regression models from training images with associated human subjective scores to predict the perceptual quality of test images. Such opinion-aware methods, however, require a large number of training samples with associated human subjective scores, covering a variety of distortion types. The BIQA models learned by opinion-aware methods often have weak generalization capability, thereby limiting their usability in practice. By comparison, opinion-unaware methods do not need human subjective scores for training and thus have greater potential for good generalization capability. Unfortunately, thus far no opinion-unaware BIQA method has shown consistently better quality prediction accuracy than the opinion-aware methods. Here, we aim to develop an opinion-unaware BIQA method that can compete with, and perhaps outperform, the existing opinion-aware methods. By integrating natural image statistics features derived from multiple cues, we learn a multivariate Gaussian model of image patches from a collection of pristine natural images. Using the learned multivariate Gaussian model, a Bhattacharyya-like distance is used to measure the quality of each image patch, and an overall quality score is then obtained by average pooling. The proposed BIQA method does not need any distorted sample images or subjective quality scores for training, yet extensive experiments demonstrate its quality-prediction performance is superior to that of state-of-the-art opinion-aware BIQA methods. The MATLAB source code of our algorithm is publicly available at www.comp.polyu.edu.hk/~cslzhang/IQA/ILNIQE/ILNIQE.htm.
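The opinion-unaware scheme sketched in the abstract (a multivariate Gaussian fit to pristine NSS features, then a Bhattacharyya-like distance per patch) can be illustrated as follows; the feature vectors are random placeholders rather than real NSS features, and the distance form is the NIQE-family convention, assumed here.

```python
import numpy as np

# Fit a multivariate Gaussian (MVG) to features of pristine patches.
rng = np.random.default_rng(0)
pristine_feats = rng.normal(size=(5000, 10))     # stand-in NSS feature vectors
mu = pristine_feats.mean(axis=0)
cov = np.cov(pristine_feats, rowvar=False)

def patch_distance(feat, test_cov=None):
    """Distance of one patch's features from the pristine MVG; averaging the
    pristine covariance with a test covariance mirrors the NIQE family."""
    c = cov if test_cov is None else (cov + test_cov) / 2.0
    d = feat - mu
    return float(np.sqrt(d @ np.linalg.pinv(c) @ d))

# Quality by average pooling over patches: higher distance == worse quality.
test_feats = rng.normal(loc=0.5, size=(100, 10)) # distorted-image patch features
quality = np.mean([patch_distance(f) for f in test_feats])
```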
43
44
Saha A, Wu QMJ. Utilizing image scales towards totally training free blind image quality assessment. IEEE Transactions on Image Processing 2015; 24:1879-1892. [PMID: 25775489] [DOI: 10.1109/tip.2015.2411436]
Abstract
A new approach to blind image quality assessment (BIQA), requiring no training, is proposed in this paper. The approach, a blind image quality evaluator based on scales, works by evaluating the global difference between the query image analyzed at different scales and the query image at its original resolution. It is based on the ability of natural images to exhibit redundant information over various scales. A distorted image is considered a deviation from the natural image, bereft of the redundancy present in the original. The more an image is distorted, the more the similarity between the original-resolution image and its down-scaled versions decreases. Therefore, the dissimilarities of an image with its low-resolution versions are accumulated in the proposed method. We decompose the query image into its scale-space and measure the global dissimilarity with the co-occurrence histograms of the original and its scaled images. These scaled images are low-pass versions of the original image, and the resulting dissimilarity, called the low-pass error, is calculated by comparing the low-pass versions across scales with the original image. The high-pass versions of the image at different scales are obtained by wavelet decomposition, and their dissimilarity from the original image, called the high-pass error, is computed with variance and gradient histograms and weighted by the contrast sensitivity function to make it perceptually effective. These two kinds of dissimilarity are combined to derive the quality score of the query image. The method requires absolutely no training with distorted images, pristine images, or subjective human scores to predict perceptual quality, relying instead on the intrinsic global change of the query image across scales. Its performance is evaluated across six publicly available databases and found to be competitive with state-of-the-art techniques.
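The low-pass half of this scheme can be sketched with plain histogram comparisons; the co-occurrence histograms, the wavelet high-pass error, and the contrast-sensitivity weighting from the abstract are omitted, so this is an illustration of the across-scale idea rather than the full method.

import numpy as np

def lowpass(img, f):
    # Crude low-pass: block-average downsample, then nearest-neighbor upsample.
    h, w = (img.shape[0] // f) * f, (img.shape[1] // f) * f
    small = img[:h, :w].reshape(h // f, f, w // f, f).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, f, axis=0), f, axis=1)

def hist_dissimilarity(a, b, bins=64):
    # L1 distance between normalized intensity histograms.
    ha, _ = np.histogram(a, bins=bins, range=(0, 255), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(0, 255), density=True)
    return 0.5 * np.abs(ha - hb).sum()

def scale_space_score(img, factors=(2, 4, 8)):
    # Accumulate low-pass error across scales; larger means more distorted.
    total = 0.0
    for f in factors:
        h, w = (img.shape[0] // f) * f, (img.shape[1] // f) * f
        total += hist_dissimilarity(img[:h, :w], lowpass(img, f))
    return total

img = np.random.default_rng(0).integers(0, 256, (256, 256)).astype(float)
print(scale_space_score(img))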
Collapse
|
45
|
Hou W, Gao X, Tao D, Li X. Blind image quality assessment via deep learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2015; 26:1275-1286. [PMID: 25122842 DOI: 10.1109/tnnls.2014.2336852] [Citation(s) in RCA: 90] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
This paper investigates how to blindly evaluate the visual quality of an image by learning rules from linguistic descriptions. Extensive psychological evidence shows that humans prefer to conduct evaluations qualitatively rather than numerically; the qualitative evaluations are then converted into numerical scores to fairly benchmark objective image quality assessment (IQA) metrics. Recently, many learning-based IQA models have been proposed by analyzing the mapping from images to numerical ratings. However, the learned mapping can hardly be accurate enough, because information is lost in the irreversible conversion from linguistic descriptions to numerical scores. In this paper, we propose a blind IQA model that learns qualitative evaluations directly and outputs numerical scores for general utilization and fair comparison. Images are represented by natural scene statistics features. A discriminative deep model is trained to classify the features into five grades, corresponding to five explicit mental concepts: excellent, good, fair, poor, and bad. A newly designed quality pooling is then applied to convert the qualitative labels into scores. The classification framework is not only more natural than regression-based models but also robust to the small-sample-size problem. Thorough experiments are conducted on popular databases to verify the model's effectiveness, efficiency, and robustness.
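The pooling step, turning a five-grade classification into a scalar score, is simple enough to sketch directly. The grade values below are illustrative assumptions, not the paper's learned pooling weights.

import numpy as np

# Assumed scores attached to the five grades: excellent .. bad.
GRADE_VALUES = np.array([100.0, 80.0, 60.0, 40.0, 20.0])

def pool_score(grade_probs):
    # Probability-weighted average of grade values -> scalar quality score.
    p = np.asarray(grade_probs, dtype=float)
    return float(p @ GRADE_VALUES / p.sum())

print(pool_score([0.05, 0.60, 0.25, 0.08, 0.02]))  # mostly "good" -> 71.6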
Collapse
|
46
|
|
47
|
|
48
|
Barri A, Dooms A, Jansen B, Schelkens P. A locally adaptive system for the fusion of objective quality measures. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2014; 23:2446-2458. [PMID: 24733011 DOI: 10.1109/tip.2014.2316379] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Objective measures to automatically predict the perceptual quality of images or videos can reduce the time and cost requirements of end-to-end quality monitoring. For reliable quality predictions, these objective quality measures need to respond consistently with the behavior of the human visual system (HVS). In practice, many important HVS mechanisms are too complex to be modeled directly. Instead, they can be mimicked by machine learning systems, trained on subjective quality assessment databases, and applied on predefined objective quality measures for specific content or distortion classes. On the downside, machine learning systems are often difficult to interpret and may even contradict the input objective quality measures, leading to unreliable quality predictions. To address this problem, we developed an interpretable machine learning system for objective quality assessment, namely the locally adaptive fusion (LAF). This paper describes the LAF system and compares its performance with traditional machine learning. Experiments show that the LAF system is more consistent with the input measures and better handles heteroscedastic training data.
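The core fusion idea, combining several base quality measures with weights adapted to the local content or distortion class, can be illustrated as follows. The class labels and weight table are made-up stand-ins; the actual LAF system learns its interpretable fusion rules from subjective databases.

import numpy as np

# Hypothetical fusion weights for three base measures, per distortion class.
WEIGHTS = {
    "blur":        np.array([0.7, 0.2, 0.1]),
    "noise":       np.array([0.2, 0.6, 0.2]),
    "compression": np.array([0.3, 0.3, 0.4]),
}

def fuse(measure_scores, distortion_class):
    # Locally adaptive fusion: pick weights by class, then combine linearly.
    return float(WEIGHTS[distortion_class] @ np.asarray(measure_scores))

print(fuse([0.80, 0.60, 0.90], "blur"))  # 0.7*0.8 + 0.2*0.6 + 0.1*0.9 = 0.77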
Collapse
|
49
|
Gao X, Gao F, Tao D, Li X. Universal blind image quality assessment metrics via natural scene statistics and multiple kernel learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2013; 24:2013-26. [PMID: 24805219 DOI: 10.1109/tnnls.2013.2271356] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Universal blind image quality assessment (IQA) metrics that work across various distortions are of great importance for image processing systems, because in practice ground truths are not always available and the distortion types are not always known. Existing state-of-the-art universal blind IQA algorithms are based on natural scene statistics (NSS). Although NSS-based metrics have obtained promising performance, they have several limitations: 1) they use either the Gaussian scale mixture model or the generalized Gaussian density to predict the non-Gaussian marginal distribution of wavelet, Gabor, or discrete cosine transform coefficients, and the prediction error makes the extracted features unable to accurately reflect changes in non-Gaussianity (NG); 2) they model the local dependency (LD) with joint statistical models and structural similarity, and although this LD essentially encodes the information redundancy in natural images, these models do not use information divergence to measure it; 3) the exponential decay characteristic (EDC), the property of natural images that large/small wavelet coefficient magnitudes tend to persist across scales and which is highly correlated with image degradation, has not been applied to universal blind IQA metrics; and 4) all existing universal blind IQA metrics use the same similarity measure for different features, even though these features have different properties. To address these problems, we propose to construct new universal blind quality indicators using all three types of NSS, i.e., the NG, LD, and EDC, and incorporating the heterogeneous property of multiple kernel learning (MKL). By analyzing how different distortions affect these statistical properties, we present two universal blind quality assessment models: an NSS global scheme and an NSS two-step scheme. In the proposed metrics, we 1) exploit the NG of natural images using the original marginal distribution of wavelet coefficients; 2) measure correlations between wavelet coefficients using mutual information as defined in information theory; 3) use EDC features directly in universal blind image quality prediction; and 4) introduce MKL to measure the similarity of different features with different kernels. Thorough experimental results on the Laboratory for Image and Video Engineering (LIVE) database II and the Tampere Image Database 2008 (TID2008) demonstrate that both metrics are remarkably consistent with human perception and outperform representative universal blind algorithms, as well as some standard full-reference quality indexes, across various types of distortions.
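The MKL ingredient, a different kernel per feature group combined before regression, can be sketched with a precomputed-kernel SVR. Fixed, hand-set weights stand in for the learned MKL coefficients, and the three feature groups (playing the roles of NG, LD, and EDC features) and quality scores below are synthetic.

import numpy as np
from sklearn.svm import SVR
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
groups = [rng.normal(size=(200, 12)) for _ in range(3)]  # stand-ins for NG, LD, EDC
scores = rng.uniform(0, 100, 200)                        # synthetic quality scores
weights = [0.5, 0.3, 0.2]                                # stand-in MKL coefficients

def combined_kernel(groups_a, groups_b):
    # Weighted sum of per-group RBF kernels.
    return sum(w * rbf_kernel(ga, gb)
               for w, ga, gb in zip(weights, groups_a, groups_b))

K_train = combined_kernel(groups, groups)
model = SVR(kernel="precomputed").fit(K_train, scores)

test_groups = [rng.normal(size=(5, 12)) for _ in range(3)]
print(model.predict(combined_kernel(test_groups, groups)))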
Collapse
|
50
|
Chandler DM. Seven Challenges in Image Quality Assessment: Past, Present, and Future Research. ISRN Signal Processing 2013. [DOI: 10.1155/2013/905685]
Abstract
Image quality assessment (IQA) has been a topic of intense research over the last several decades. With each year comes an increasing number of new IQA algorithms, extensions of existing IQA algorithms, and applications of IQA to other disciplines. In this article, I first provide an up-to-date review of research in IQA, and then I highlight several open challenges in this field. The first half of this article discusses key properties of visual perception, image quality databases, and existing full-reference, no-reference, and reduced-reference IQA algorithms. Yet, despite the remarkable progress that has been made in IQA, many fundamental challenges remain largely unsolved. The second half of this article highlights some of these challenges. I specifically discuss challenges stemming from the lack of complete perceptual models for natural images, compound and suprathreshold distortions, and multiple distortions, as well as the interactive effects of these distortions on images. I also discuss challenges related to IQA of images containing nontraditional distortions, and challenges related to computational efficiency. The goal of this article is not only to help practitioners and researchers keep abreast of recent advances in IQA, but also to raise awareness of the key limitations of current IQA knowledge.
Collapse
|