1. Alharbi FM, Besbes FR, Almutairi BS, Alotaibi AT, Fatani FF, Besbes HR. Fusion of Magnetic Resonance Elastography Images With Computed Tomography and Magnetic Resonance Imaging Using the Human Visual System. Cureus 2023; 15:e45109. [PMID: 37842423] [PMCID: PMC10569364] [DOI: 10.7759/cureus.45109]
Abstract
Magnetic resonance elastography (MRE) is used to assess liver stiffness to rule out cirrhosis or fibrosis. The image, nevertheless, is regarded as shear-wave imaging and does not depict anatomical features. Multimodality medical image fusion (MMIF), such as the fusion of MRE with computed tomography (CT) or magnetic resonance imaging (MRI), can help doctors exploit the advantages of each imaging technique. Because fused images are ultimately judged by human observers, perceptual criteria serve as valid and valuable assessment measures. The contrast sensitivity function (CSF), which describes how visual contrast sensitivity varies with spatial frequency, is used to characterize the human visual system (HVS) mathematically. We therefore propose novel image fusion methods that apply the discrete wavelet transform (DWT) together with HVS and CSF models. MRI or CT images were combined with MRE images, and the outcomes were assessed both subjectively and objectively. In the qualitative analysis, the fused images were visually inspected. In all four datasets, the CT-MRE fused images were superior at preserving bone and spatial resolution, while the MRI-MRE fused images were better at exhibiting soft tissue and contrast resolution. In all four datasets, the liver soft tissue in the MRI and CT images fused successfully with the red-colored stiffness distribution seen in the MRE images. The proposed approach outperformed plain DWT fusion, which produced visual artifacts such as signal loss. Quantitative evaluation using mean, standard deviation, and entropy showed that the images generated by the proposed technique performed better than the source images and DWT. Additionally, peak signal-to-noise ratio, mean square error, correlation coefficient, and structural similarity index measure were employed to compare the two fusion approaches, MRI-MRE and CT-MRE; the comparison did not show the superiority of one approach over the other. In conclusion, both subjective and objective evaluation revealed that the fused images contained more information and characteristics. Hence, the proposed method may be a useful procedure for diagnosing and localizing stiff regions in liver soft tissue by fusing MRE with MRI or CT.
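As a point of reference for the plain DWT baseline this abstract compares against, below is a minimal sketch of conventional DWT fusion (average the approximation band, keep the larger-magnitude detail coefficient); the paper's HVS/CSF-weighted selection rule is not reproduced, and the function name, wavelet, and level are illustrative assumptions.

```python
import numpy as np
import pywt  # PyWavelets

def dwt_fuse(img_a, img_b, wavelet="db2", level=3):
    """Fuse two registered, same-size images: average the approximation
    band, keep the larger-magnitude coefficient in each detail band."""
    ca = pywt.wavedec2(img_a.astype(np.float64), wavelet, level=level)
    cb = pywt.wavedec2(img_b.astype(np.float64), wavelet, level=level)
    fused = [0.5 * (ca[0] + cb[0])]          # approximation: average
    for da, db in zip(ca[1:], cb[1:]):       # details: max-abs rule
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(da, db)))
    return pywt.waverec2(fused, wavelet)
```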

2. Jin Y, Patney A, Webb R, Bovik AC. FOVQA: Blind Foveated Video Quality Assessment. IEEE Transactions on Image Processing 2022; 31:4571-4584. [PMID: 35767478] [DOI: 10.1109/tip.2022.3185738]
Abstract
Previous blind, or no-reference (NR), image/video quality assessment (IQA/VQA) models largely rely on features drawn from natural scene statistics (NSS), under the assumption that the image statistics are stationary in the spatial domain. Several of these models are quite successful on standard pictures. However, in Virtual Reality (VR) applications, foveated video compression is regaining attention, and the concept of space-variant quality assessment is of interest, given the availability of increasingly high spatial and temporal resolution content and practical ways of measuring gaze direction. Distortions from foveated video compression increase with eccentricity, implying that the natural scene statistics are space-variant. Towards advancing the development of foveated compression/streaming algorithms, we have devised an NR foveated video quality assessment model, called FOVQA, which is based on new models of space-variant natural scene statistics (NSS) and natural video statistics (NVS). Specifically, we deploy a space-variant generalized Gaussian distribution (SV-GGD) model and a space-variant asynchronous generalized Gaussian distribution (SV-AGGD) model of mean subtracted contrast normalized (MSCN) coefficients and of products of neighboring MSCN coefficients, respectively. We devise a foveated video quality predictor that extracts radial basis features, along with other features that capture perceptually annoying rapid quality fall-offs. We find that FOVQA achieves state-of-the-art (SOTA) performance on the new 2D LIVE-FBT-FCVR database, as compared with other leading foveated IQA/VQA models. We have made our implementation of FOVQA available at: https://live.ece.utexas.edu/research/Quality/FOVQA.zip.
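For readers unfamiliar with MSCN coefficients, the sketch below computes the standard space-invariant MSCN transform used throughout the NSS literature; FOVQA's space-variant GGD/AGGD fitting is not reproduced here, and the Gaussian window width is an assumed typical value.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(img, sigma=7.0 / 6.0, eps=1e-8):
    """Mean-subtracted, contrast-normalized (MSCN) coefficients of a
    grayscale image, using Gaussian local mean/variance estimates."""
    img = img.astype(np.float64)
    mu = gaussian_filter(img, sigma)                 # local mean
    var = gaussian_filter(img * img, sigma) - mu**2  # local variance
    return (img - mu) / (np.sqrt(np.maximum(var, 0.0)) + eps)
```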

3. Ding K, Ma K, Wang S, Simoncelli EP. Image Quality Assessment: Unifying Structure and Texture Similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:2567-2581. [PMID: 33338012] [DOI: 10.1109/tpami.2020.3045810]
Abstract
Objective measures of image quality generally operate by comparing pixels of a "degraded" image to those of the original. Relative to human observers, these measures are overly sensitive to resampling of texture regions (e.g., replacing one patch of grass with another). Here, we develop the first full-reference image quality model with explicit tolerance to texture resampling. Using a convolutional neural network, we construct an injective and differentiable function that transforms images to multi-scale overcomplete representations. We demonstrate empirically that the spatial averages of the feature maps in this representation capture texture appearance, in that they provide a set of sufficient statistical constraints to synthesize a wide variety of texture patterns. We then describe an image quality method that combines correlations of these spatial averages ("texture similarity") with correlations of the feature maps ("structure similarity"). The parameters of the proposed measure are jointly optimized to match human ratings of image quality, while minimizing the reported distances between subimages cropped from the same texture images. Experiments show that the optimized method explains human perceptual scores on conventional image quality databases as well as on texture databases. The measure also offers competitive performance on related tasks such as texture classification and retrieval. Finally, we show that our method is relatively insensitive to geometric transformations (e.g., translation and dilation), without use of any specialized training or data augmentation. Code is available at https://github.com/dingkeyan93/DISTS.
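As a rough illustration of the texture/structure split described above, the toy function below scores two stacks of feature maps by comparing their spatial means ("texture") and their normalized cross-correlation ("structure"); it omits the trained VGG-based transform, multi-scale weights, and calibration of the actual DISTS model (available at the GitHub link), and the function name is mine.

```python
import numpy as np

def texture_structure_score(f_ref, f_dist, eps=1e-6):
    """Toy global similarity over two stacks of feature maps of shape
    (channels, H, W): SSIM-style comparison of spatial means ("texture")
    plus normalized cross-correlation of the maps ("structure")."""
    scores = []
    for x, y in zip(f_ref, f_dist):
        mx, my = x.mean(), y.mean()
        tex = (2.0 * mx * my + eps) / (mx**2 + my**2 + eps)
        cov = ((x - mx) * (y - my)).mean()
        struct = (2.0 * cov + eps) / (x.var() + y.var() + eps)
        scores.append(0.5 * (tex + struct))
    return float(np.mean(scores))
```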

4. Duanmu Z, Liu W, Wang Z, Wang Z. Annual Review of Vision Science 2021; 7.
Abstract
Image quality assessment (IQA) models aim to establish a quantitative relationship between visual images and their quality as perceived by human observers. IQA modeling plays a special bridging role between vision science and engineering practice, both as a test-bed for vision theories and computational biovision models and as a powerful tool that could potentially have a profound impact on a broad range of image processing, computer vision, and computer graphics applications for design, optimization, and evaluation purposes. The growth of IQA research has accelerated over the past two decades. In this review, we present an overview of IQA methods from a Bayesian perspective, with the goals of unifying a wide spectrum of IQA approaches under a common framework and providing useful references to fundamental concepts accessible to vision scientists and image processing practitioners. We discuss the implications of the successes and limitations of modern IQA methods for biological vision and the prospect for vision science to inform the design of future artificial vision systems. (The detailed model taxonomy can be found at http://ivc.uwaterloo.ca/research/bayesianIQA/.)

Affiliation(s)
- Zhengfang Duanmu, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Wentao Liu, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Zhongling Wang, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Zhou Wang, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada

5. Jin Y, Chen M, Goodall T, Patney A, Bovik AC. Subjective and Objective Quality Assessment of 2D and 3D Foveated Video Compression in Virtual Reality. IEEE Transactions on Image Processing 2021; 30:5905-5919. [PMID: 34125674] [DOI: 10.1109/tip.2021.3087322]
Abstract
In Virtual Reality (VR), the requirements of much higher resolution and smooth viewing experiences under rapid and often real-time changes in viewing direction lead to significant challenges in compression and communication. To reduce the stresses of very high bandwidth consumption, the concept of foveated video compression is being accorded renewed interest. By exploiting the space-variant property of retinal visual acuity, foveation has the potential to substantially reduce video resolution in the visual periphery, with hardly noticeable perceptual quality degradations. Accordingly, foveated image/video quality predictors are also becoming increasingly important, as a practical way to monitor and control future foveated compression algorithms. Towards advancing the development of foveated image/video quality assessment (FIQA/FVQA) algorithms, we constructed 2D and (stereoscopic) 3D VR databases of foveated/compressed videos and conducted a human study of perceptual quality on each database. Each database includes 10 reference videos and 180 foveated videos, produced by applying three levels of foveation to the reference videos. Foveation was applied by increasing compression with increased eccentricity. In the 2D study, each video was of resolution 7680×3840 and was viewed and quality-rated by 36 subjects, while in the 3D study, each video was of resolution 5376×5376 and rated by 34 subjects. Both studies were conducted on top of a foveated video player having low motion-to-photon latency (~50 ms). We evaluated different objective image and video quality assessment algorithms, including both FIQA/FVQA algorithms and non-foveated algorithms, on our so-called LIVE-Facebook Technologies Foveation-Compressed Virtual Reality (LIVE-FBT-FCVR) databases. We also present a statistical evaluation of the relative performance of these algorithms. The LIVE-FBT-FCVR databases have been made publicly available at https://live.ece.utexas.edu/research/LIVEFBTFCVR/index.html.

6. Ding K, Ma K, Wang S, Simoncelli EP. Comparison of Full-Reference Image Quality Models for Optimization of Image Processing Systems. International Journal of Computer Vision 2021; 129:1258-1281. [PMID: 33495671] [PMCID: PMC7817470] [DOI: 10.1007/s11263-020-01419-7]
Abstract
The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human quality judgments. Perceptual datasets gathered for this purpose have provided useful benchmarks for improving IQA methods, but their heavy use creates a risk of overfitting. Here, we perform a large-scale comparison of IQA models in terms of their use as objectives for the optimization of image processing algorithms. Specifically, we use eleven full-reference IQA models to train deep neural networks for four low-level vision tasks: denoising, deblurring, super-resolution, and compression. Subjective testing on the optimized images allows us to rank the competing models in terms of their perceptual performance, elucidate their relative advantages and disadvantages in these tasks, and propose a set of desirable properties for incorporation into future IQA models.

Affiliation(s)
- Keyan Ding, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Kede Ma, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Shiqi Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Eero P. Simoncelli, Howard Hughes Medical Institute, Center for Neural Science, and Courant Institute of Mathematical Sciences, New York University, New York, USA

7. Liu F, Ahanonu EL, Marcellin MW, Lin Y, Ashok A, Bilgin A. Visibility of Quantization Errors in Reversible JPEG2000. Signal Processing: Image Communication 2020; 84:115812. [PMID: 32205917] [PMCID: PMC7088451] [DOI: 10.1016/j.image.2020.115812]
Abstract
Image compression systems that exploit the properties of the human visual system have been studied extensively over the past few decades. For the JPEG2000 image compression standard, all previous methods that aim to optimize perceptual quality have considered the irreversible pipeline of the standard. In this work, we propose an approach for the reversible pipeline of the JPEG2000 standard. We introduce a new methodology to measure visibility of quantization errors when reversible color and wavelet transforms are employed. Incorporation of the visibility thresholds using this methodology into a JPEG2000 encoder enables creation of scalable codestreams that can provide both near-threshold and numerically lossless representations, which is desirable in applications where restoration of original image samples is required. Most importantly, this is the first work that quantifies the bitrate penalty incurred by the reversible transforms in near-threshold image compression compared to the irreversible transforms.

Affiliation(s)
- Feng Liu, College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300350, People's Republic of China
- Eze L. Ahanonu, Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA
- Michael W. Marcellin, Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA
- Yuzhang Lin, Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA
- Amit Ashok, Department of Electrical and Computer Engineering and College of Optical Sciences, University of Arizona, Tucson, AZ 85721, USA
- Ali Bilgin, Department of Electrical and Computer Engineering and Department of Biomedical Engineering, University of Arizona, Tucson, AZ 85721, USA

8.
Abstract
Imperceptibility and robustness are two complementary but fundamental requirements of any digital image watermarking method. To improve the invisibility and robustness of multiplicative image watermarking, a complex-wavelet-based watermarking algorithm is proposed that uses a human visual texture-masking model and a visual saliency model. First, image blocks with high entropy are selected as the watermark embedding space to achieve imperceptibility. Then, an adaptive multiplicative watermark embedding strength factor is designed by utilizing texture masking and visual saliency to enhance robustness. Furthermore, the complex wavelet coefficients of the low-frequency sub-band are modeled by a Gaussian distribution, and a watermark decoding method is proposed based on the maximum likelihood criterion. Finally, the effectiveness of the watermarking is validated experimentally using the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM). Simulation results demonstrate the invisibility of the proposed method and its strong robustness against various attacks, including additive noise, image filtering, JPEG compression, amplitude scaling, rotation, and combinational attacks.
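A minimal sketch of the high-entropy block selection step described above, assuming 8-bit grayscale input; the block size, bin count, and top-k rule are illustrative choices of mine, not the paper's exact parameters.

```python
import numpy as np

def block_entropy(block, bins=32):
    """Shannon entropy (bits) of an 8-bit block's intensity histogram."""
    counts, _ = np.histogram(block, bins=bins, range=(0, 255))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def high_entropy_blocks(img, bsize=32, top_k=16):
    """Coordinates of the top-k highest-entropy blocks, as a candidate
    watermark embedding region selector."""
    scored = [(block_entropy(img[i:i + bsize, j:j + bsize]), i, j)
              for i in range(0, img.shape[0] - bsize + 1, bsize)
              for j in range(0, img.shape[1] - bsize + 1, bsize)]
    scored.sort(reverse=True)
    return [(i, j) for _, i, j in scored[:top_k]]
```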

9. Liu J, Wu S, Xu X. A Logarithmic Quantization-Based Image Watermarking Using Information Entropy in the Wavelet Domain. Entropy 2018; 20:945. [PMID: 33266669] [PMCID: PMC7512558] [DOI: 10.3390/e20120945]
Abstract
Conventional quantization-based watermarking may be easily estimated by averaging over a set of watermarked signals when a uniform quantization approach is used. Moreover, the conventional quantization-based method neglects the visual perceptual characteristics of the host signal; thus, perceptible distortions may be introduced in some parts of the host signal. In this paper, inspired by Watson's entropy masking model and logarithmic quantization index modulation (LQIM), a logarithmic quantization-based image watermarking method is developed using the wavelet transform. The method improves the robustness of watermarking through a logarithmic quantization strategy that embeds the watermark data into image blocks with high entropy values. The main significance of this work is that the trade-off between invisibility and robustness is addressed simply by the logarithmic quantization approach, which applies the entropy masking model and a distortion-compensated scheme to develop a watermark embedding method. In this manner, the optimal quantization parameter, obtained by minimizing the quantization distortion function, effectively controls the watermark strength. For watermark decoding, we model the wavelet coefficients of the image by the generalized Gaussian distribution (GGD) and calculate the bit error probability of the proposed method. The performance of the proposed method is analyzed and verified by simulation on real images. Experimental results demonstrate that the proposed method is imperceptible and strongly robust against attacks including JPEG compression, additive white Gaussian noise (AWGN), Gaussian filtering, salt-and-pepper noise, scaling, and rotation.
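The sketch below shows the core idea of logarithmic QIM in its generic dither-modulation form, assuming coefficients pre-scaled to roughly [-1, 1]; the paper's entropy-based block selection, distortion compensation, and optimal parameter choice are not reproduced, and the parameter values are illustrative.

```python
import numpy as np

def lqim_embed(c, bit, delta=0.1, mu=10.0):
    """Embed one bit in coefficient c by dithered uniform quantization in
    a mu-law compressed (logarithmic) domain; decoding re-compresses the
    received coefficient and picks the nearer of the two cosets."""
    s = np.sign(c) if c != 0 else 1.0
    z = np.log1p(mu * abs(c)) / np.log1p(mu)    # mu-law compression
    d = 0.5 * delta * bit                        # dither selects the coset
    zq = delta * np.round((z - d) / delta) + d   # uniform quantizer on z
    return s * np.expm1(zq * np.log1p(mu)) / mu  # expand back
```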

10. Hadizadeh H, Heravi AR, Bajic IV, Karami P. A Perceptual Distinguishability Predictor for JND-Noise-Contaminated Images. IEEE Transactions on Image Processing 2018; 28:2242-2256. [PMID: 30507532] [DOI: 10.1109/tip.2018.2883893]
Abstract
Just noticeable difference (JND) models are widely used for perceptual redundancy estimation in images and videos. A common method for measuring the accuracy of a JND model is to inject random noise into an image based on the JND model, and check whether the JND-noise-contaminated image is perceptually distinguishable from the original image. Also, when comparing the accuracy of two different JND models, the model that produces the JND-noise-contaminated image with better quality at the same noise energy is the better model. In both cases, however, a subjective test is necessary, which is very time consuming and costly. In this paper, we present a full-reference metric called the perceptual distinguishability predictor (PDP), which can be used to determine whether a given JND-noise-contaminated image is perceptually distinguishable from the reference image. The proposed metric employs the concept of sparse coding and extracts a feature vector from a given image pair. The feature vector is then fed to a multilayer neural network for classification. To train the network, we built a public database of 999 natural images with distinguishability thresholds for four different JND models, obtained from an extensive subjective experiment. The results indicate that PDP achieves a high classification accuracy of 97.1%. The proposed method can be used to objectively compare various JND models without performing any subjective test. It can also be used to obtain proper scaling factors to improve the JND thresholds estimated by an arbitrary JND model.
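A minimal sketch of the JND noise-injection step described above, assuming an 8-bit image and a per-pixel JND threshold map; the bipolar-sign scheme is the common convention in this literature, while the scaling factor alpha is my illustrative knob.

```python
import numpy as np

def inject_jnd_noise(img, jnd_map, alpha=1.0, rng=None):
    """Add bipolar random noise scaled pixel-wise by a JND threshold map;
    alpha scales the overall noise energy (alpha=1 sits at threshold)."""
    rng = np.random.default_rng() if rng is None else rng
    signs = rng.choice([-1.0, 1.0], size=img.shape)
    return np.clip(img + alpha * jnd_map * signs, 0, 255)
```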

11.
Abstract
To improve the invisibility and robustness of multiplicative watermarking, an adaptive image watermarking algorithm is proposed based on a visual saliency model and the Laplacian distribution in the wavelet domain. The algorithm designs an adaptive multiplicative watermark strength factor by utilizing the energy aggregation of the high-frequency wavelet sub-band, texture masking, and visual saliency characteristics. Then, image blocks with high energy are selected as the watermark embedding space to achieve imperceptibility. For watermark detection, the Laplacian distribution is used to model the wavelet coefficients, and a blind watermark detection approach is derived based on the maximum likelihood scheme. Finally, the performance of the proposed algorithm is analyzed and compared through simulation. Experimental results show that the proposed algorithm is robust against additive white Gaussian noise, JPEG compression, median filtering, scaling, rotation, and other attacks.

12. Quantization-Based Image Watermarking by Using a Normalization Scheme in the Wavelet Domain. Information 2018. [DOI: 10.3390/info9080194]
Abstract
To improve the invisibility and robustness of quantization-based image watermarking, we develop an improved quantization watermarking method based on the wavelet transform and a normalization strategy. In watermark encoding, a sorting strategy over the wavelet coefficients is used to calculate the quantization step size. Robustness comes from normalization-based watermark embedding and from controlling the amount of modification on each wavelet coefficient by choosing a proper quantization parameter in high-entropy image regions. In watermark detection, the original unmarked image is not required, and the probabilities of false alarm and detection are examined through experimental simulation. Experimental results show the effectiveness of the proposed watermarking; furthermore, the proposed method is more robust than an alternative quantization-based watermarking algorithm.

13.
Abstract
It is well known that models of the human visual system (HVS) can describe the underlying masking properties relevant to image processing. In general, the HVS can only perceive changes in a scene when they are greater than the just noticeable distortion (JND) threshold. Moreover, the cognitive resources of human visual attention are limited and cannot be devoted to all stimuli; only the more important stimuli elicit a response. Visual saliency must therefore be introduced to model human perception more accurately. In this paper, we present a new wavelet-based JND estimation method that takes into account the interrelationship between visual saliency and the JND threshold. We verify it experimentally from both subjective and objective aspects. The experimental results show that extracting the saliency map of the image in the discrete wavelet transform (DWT) domain and then modulating the JND threshold performs better than the unmodulated JND model.
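The abstract does not give the modulation rule, so the sketch below is one plausible, assumed form in which higher saliency lowers the local JND threshold (attended regions tolerate less distortion); the linear form and the depth parameter beta are my inventions for illustration only.

```python
import numpy as np

def modulate_jnd(jnd_map, saliency_map, beta=0.5):
    """Lower the JND threshold where saliency is high and raise it where
    saliency is low; beta in [0, 1] controls the modulation depth."""
    s = (saliency_map - saliency_map.min()) / (np.ptp(saliency_map) + 1e-8)
    return jnd_map * (1.0 + beta * (0.5 - s))
```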

14. Ding Y, Zhao Y. No-Reference Stereoscopic Image Quality Assessment Guided by Visual Hierarchical Structure and Binocular Effects. Applied Optics 2018; 57:2610-2621. [PMID: 29714248] [DOI: 10.1364/ao.57.002610]
Abstract
Stereoscopic image quality assessment (SIQA) is an essential technique for modern 3D image and video processing systems, serving as a performance evaluator and monitor. However, the study of SIQA remains immature due to the complexity of the human visual system (HVS) and the binocular effects that binocular vision brings about. To overcome these difficulties, a novel method is proposed that extracts and quantifies quality-aware image features related to the cortical areas in charge of visual quality perception, rather than attempting to rigorously simulate the biological processing in the HVS, so that prediction accuracy is preserved while computational complexity remains moderate. Meanwhile, binocular effects including binocular rivalry and visual discomfort are taken into consideration. Moreover, the proposed method operates entirely without reference images, indicating wide practical usage. Compared with state-of-the-art works, our method shows evident superiority in terms of effectiveness and robustness.

15.
Abstract
Image sizes have increased exponentially in recent years. The resulting high-resolution images are often viewed via remote image browsing. Zooming and panning are desirable features in this context, which result in disparate spatial regions of an image being displayed at a variety of (spatial) resolutions. When an image is displayed at a reduced resolution, the quantization step sizes needed for visually lossless quality generally increase. This paper investigates the quantization step sizes needed for visually lossless display as a function of resolution, and proposes a method that effectively incorporates the resulting (multiple) quantization step sizes into a single JPEG2000 codestream. This codestream is JPEG2000 Part 1 compliant and allows for visually lossless decoding at all resolutions natively supported by the wavelet transform as well as arbitrary intermediate resolutions, using only a fraction of the full-resolution codestream. When images are browsed remotely using the JPEG2000 Interactive Protocol (JPIP), the required bandwidth is significantly reduced, as demonstrated by extensive experimental results.

Affiliation(s)
- Han Oh, National Satellite Operation and Application Center, Korea Aerospace Research Institute (KARI), 169-84 Gwahak-ro, Yuseong-gu, Daejeon, 34133, Republic of Korea
- Ali Bilgin, Department of Biomedical Engineering and Department of Electrical and Computer Engineering, The University of Arizona, Tucson, AZ 85721, USA
- Michael Marcellin, Department of Electrical and Computer Engineering, The University of Arizona, 1230 E. Speedway Blvd, Tucson, AZ 85721, USA

16. Zhang W, Liu H. Toward a Reliable Collection of Eye-Tracking Data for Image Quality Research: Challenges, Solutions, and Applications. IEEE Transactions on Image Processing 2017; 26:2424-2437. [PMID: 28362586] [DOI: 10.1109/tip.2017.2681424]
Abstract
Image quality assessment potentially benefits from the addition of visual attention. However, incorporating aspects of visual attention into image quality models by means of a perceptually optimized strategy is largely unexplored. Fundamental challenges remain, such as how visual attention is affected by the concurrence of visual signals and their distortions; whether an image quality model should include visual attention affected by distortion or attention driven by the original scene only; and how to select visual attention models for the image quality application context. To shed light on these unsolved issues, designing and performing eye-tracking experiments is essential. So far, collecting eye-tracking data for image quality studies has been confronted with a bias due to the involvement of stimulus repetition. In this paper, we propose a new experimental methodology that eliminates this inherent bias, allowing reliable eye-tracking data to be obtained with a large degree of stimulus variability. We conducted 5760 eye movement trials in which 160 human observers freely viewed 288 images of varying quality. We then used the resulting eye-tracking data to provide insights into the optimal use of visual attention in image quality research. The new eye-tracking data are made publicly available to the research community.

17. Hill P, Al-Mualla ME, Bull D. Perceptual Image Fusion Using Wavelets. IEEE Transactions on Image Processing 2017; 26:1076-1088. [PMID: 27913344] [DOI: 10.1109/tip.2016.2633863]
Abstract
A perceptual image fusion method is proposed that employs explicit luminance and contrast masking models. These models are combined to give the perceptual importance of each coefficient produced by the dual-tree complex wavelet transform of each input image. This combined model of perceptual importance is used to select which coefficients are retained and furthermore to determine how to present the retained information in the most effective way. This paper is the first to give a principled approach to image fusion from a perceptual perspective. Furthermore, the proposed method is shown to give improved quantitative and qualitative results compared with previously developed methods.

19. Oh H, Lee S. Visual Presence: Viewing Geometry Visual Information of UHD S3D Entertainment. IEEE Transactions on Image Processing 2016; 25:3358-3371. [PMID: 28113720] [DOI: 10.1109/tip.2016.2567099]
Abstract
To maximize the presence experienced by humans, visual content has evolved to achieve ever higher visual presence through a series of formats: high definition (HD), ultra HD (UHD), 8K UHD, and 8K stereoscopic 3D (S3D). Several studies have examined the visual presence delivered by content when viewing UHD S3D from a content-analysis perspective. Nevertheless, no clear definition of visual presence has been presented, and only subjective evaluation has been relied upon. The main reason is that there is a limit to defining visual presence via content information alone. In this paper, we define visual presence for each viewing environment and investigate a novel methodology to measure the visual presence experienced when viewing both 2D and 3D content, by defining a new metric termed the volume of visual information, which quantifies the influence of the viewing geometry between the display and the viewer. To achieve this goal, the viewing geometry and display parameters for both flat and atypical displays are analyzed in terms of human perception by introducing a novel concept of pixel-wise geometry. In addition, perceptual weighting through analysis of content information is performed in accordance with monocular and binocular vision characteristics. Experimental results show that the constructed model, based on the viewing geometry, content, and perceptual characteristics, correlates highly (about 84%) with subjective evaluations.

20. Hill P, Achim A, Al-Mualla ME, Bull D. Contrast Sensitivity of the Wavelet, Dual Tree Complex Wavelet, Curvelet, and Steerable Pyramid Transforms. IEEE Transactions on Image Processing 2016; 25:2739-2751. [PMID: 27093623] [DOI: 10.1109/tip.2016.2552725]
Abstract
Accurate estimation of the contrast sensitivity of the human visual system is crucial for perceptually based image processing in applications such as compression, fusion, and denoising. Conventional contrast sensitivity functions (CSFs) have been obtained using fixed-size Gabor functions. However, the basis functions of multiresolution decompositions such as wavelets often resemble Gabor functions but are of variable size and shape, so using conventional CSFs in such cases is not appropriate. We have therefore conducted a set of psychophysical tests in order to obtain the CSF for a range of multiresolution transforms: the discrete wavelet transform, the steerable pyramid, the dual-tree complex wavelet transform, and the curvelet transform. These measures were obtained by varying the contrast of each transform's basis functions in a 2AFC experiment combined with an adapted version of the QUEST psychometric function method. The results enable future image processing applications that exploit these transforms, such as signal fusion, super-resolution processing, denoising, and motion estimation, to be perceptually optimized in a principled fashion. The results are compared with an existing vision model (HDR-VDP2) and are used to show quantitative improvements within a denoising application compared with conventional CSF values.

21. Zhang W, Borji A, Wang Z, Le Callet P, Liu H. The Application of Visual Saliency Models in Objective Image Quality Assessment: A Statistical Evaluation. IEEE Transactions on Neural Networks and Learning Systems 2016; 27:1266-1278. [PMID: 26277009] [DOI: 10.1109/tnnls.2015.2461603]
Abstract
Advances in image quality assessment have shown the potential added value of including visual attention aspects in its objective assessment. Numerous models of visual saliency have been implemented and integrated into different image quality metrics (IQMs), but the gain in reliability of the resulting IQMs varies to a large extent. Understanding the causes and trends of this variation would be highly beneficial for further improvement of IQMs, but they are not fully understood. In this paper, an exhaustive statistical evaluation is conducted to justify the added value of computational saliency in objective image quality assessment, using 20 state-of-the-art saliency models and 12 best-known IQMs. Quantitative results show that the differences between saliency models in predicting human fixations are sufficient to yield a significant difference in performance gain when these saliency models are added to IQMs. However, surprisingly, the extent to which an IQM can profit from adding a saliency model does not appear to be directly related to how well that saliency model predicts human fixations. Our statistical analysis provides useful guidance for applying saliency models in IQMs, in terms of the effects of saliency model dependence, IQM dependence, and image distortion dependence. The testbed and software are made publicly available to the research community.

22. Yang J, Lin Y, Gao Z, Lv Z, Wei W, Song H. Quality Index for Stereoscopic Images by Separately Evaluating Adding and Subtracting. PLoS One 2015; 10:e0145800. [PMID: 26717412] [PMCID: PMC4699220] [DOI: 10.1371/journal.pone.0145800]
Abstract
The human visual system (HVS) plays an important role in stereo image quality perception, so there is considerable interest in how to exploit knowledge of visual perception in image quality assessment models. This paper proposes a full-reference metric for quality assessment of stereoscopic images based on the binocular difference channel and the binocular summation channel. For a stereo pair, the binocular summation map and binocular difference map are first computed by adding and subtracting the left and right images. The binocular summation is then decoupled into two parts, namely additive impairments and detail losses, and its quality is obtained as the adaptive combination of the quality of detail losses and additive impairments. The quality of the binocular difference is computed using the contrast sensitivity function (CSF) and weighted multi-scale structural similarity (MS-SSIM). Finally, the quality of binocular summation and binocular difference is integrated into an overall quality index. The experimental results indicate that, compared with existing metrics, the proposed metric is highly consistent with subjective quality assessment and is a robust measure. The results also indirectly support the hypothesis that binocular summation and binocular difference channels exist.
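The first step of the pipeline above is simple enough to state directly; a minimal sketch, assuming a registered stereo pair of equal-size grayscale arrays (the CSF weighting and MS-SSIM scoring stages are not reproduced here):

```python
import numpy as np

def binocular_channels(left, right):
    """Binocular summation and difference maps of a registered stereo pair."""
    left, right = left.astype(np.float64), right.astype(np.float64)
    return left + right, left - right
```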

Affiliation(s)
- Jiachen Yang, School of Electronic Information Engineering, Tianjin University, 92 Weijin Road, Tianjin, 300072, China
- Yancong Lin, School of Electronic Information Engineering, Tianjin University, 92 Weijin Road, Tianjin, 300072, China
- Zhiqun Gao, School of Electronic Information Engineering, Tianjin University, 92 Weijin Road, Tianjin, 300072, China
- Zhihan Lv, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, 518055, China
- Wei Wei, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, Shaanxi, 710048, China
- Houbing Song, Department of Electrical and Computer Engineering, West Virginia University, Montgomery, WV 25136, United States of America

23. Hu S, Jin L, Wang H, Zhang Y, Kwong S, Kuo CCJ. Compressed Image Quality Metric Based on Perceptually Weighted Distortion. IEEE Transactions on Image Processing 2015; 24:5594-5608. [PMID: 26415170] [DOI: 10.1109/tip.2015.2481319]
Abstract
Objective quality assessment for compressed images is critical to various image compression systems that are essential in image delivery and storage. Although the mean squared error (MSE) is computationally simple, it may not accurately reflect the perceptual quality of compressed images, which is also affected dramatically by characteristics of the human visual system (HVS) such as the masking effect. In this paper, an image quality metric (IQM) is proposed based on perceptually weighted distortion in terms of the MSE. To capture the characteristics of the HVS, a randomness map is proposed to measure the masking effect, and a preprocessing scheme is proposed to simulate the processing that occurs in the initial part of the HVS. Since the masking effect depends strongly on structural randomness, the prediction error from the neighborhood, computed with a statistical model, is used to measure the significance of masking. Meanwhile, imperceptible high-frequency signals are removed by preprocessing with low-pass filters. The relation between the distortions before and after the masking effect is investigated, and a masking modulation model is proposed to simulate the masking effect after preprocessing. The performance of the proposed IQM is validated on six image databases with various compression distortions. The experimental results show that the proposed algorithm outperforms other benchmark IQMs.
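The skeleton of any perceptually weighted MSE metric is a weighted pooling of squared errors; a minimal sketch, assuming a precomputed per-pixel weight map (e.g., derived from a masking model) rather than the paper's specific randomness map:

```python
import numpy as np

def weighted_mse(ref, dist, weights):
    """MSE with per-pixel perceptual weights; weights are assumed
    non-negative with a positive sum."""
    err = (ref.astype(np.float64) - dist.astype(np.float64)) ** 2
    return float((weights * err).sum() / weights.sum())
```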

24. Lee D, Plataniotis KN. Towards a Full-Reference Quality Assessment for Color Images Using Directional Statistics. IEEE Transactions on Image Processing 2015; 24:3950-3965. [PMID: 26186778] [DOI: 10.1109/tip.2015.2456419]
Abstract
This paper presents a novel computational model for quantifying the perceptual quality of color images consistently with subjective evaluations. The proposed full-reference color metric, namely, a directional statistics-based color similarity index, is designed to consistently perform well over commonly encountered chromatic and achromatic distortions. In order to accurately predict the visual quality of color images, we make use of local color descriptors extracted from three perceptual color channels: 1) hue; 2) chroma; and 3) lightness. In particular, directional statistical tools are employed to properly process hue data by considering their periodicities. Moreover, two weighting mechanisms are exploited to accurately combine locally measured comparison scores into a final score. Extensive experimentation performed on large-scale databases indicates that the proposed metric is effective across a wide range of chromatic and achromatic distortions, making it better suited for the evaluation and optimization of color image processing algorithms.

25. Saha A, Wu QMJ. Utilizing Image Scales Towards Totally Training Free Blind Image Quality Assessment. IEEE Transactions on Image Processing 2015; 24:1879-1892. [PMID: 25775489] [DOI: 10.1109/tip.2015.2411436]
Abstract
A new approach to blind image quality assessment (BIQA), requiring no training, is proposed in this paper. The approach, named the blind image quality evaluator based on scales, works by evaluating the global difference between the query image analyzed at different scales and the query image at its original resolution. It is based on the ability of natural images to exhibit redundant information over various scales. A distorted image is considered a deviation from the natural image, bereft of the redundancy present in the original. The more an image is distorted, the more the similarity between the original-resolution image and its down-scaled versions decreases. Therefore, the dissimilarities of an image with its low-resolution versions are accumulated in the proposed method. We dissolve the query image into its scale space and measure the global dissimilarity with the co-occurrence histograms of the original and its scaled images. These scaled images are low-pass versions of the original. The dissimilarity, called the low-pass error, is calculated by comparing the low-pass versions across scales with the original image. The high-pass versions of the image at different scales are obtained by wavelet decomposition, and their dissimilarity from the original image is also calculated. This dissimilarity, called the high-pass error, is computed with variance and gradient histograms and weighted by the contrast sensitivity function to make it perceptually effective. These two kinds of dissimilarity are combined to derive the quality score of the query image. The method requires absolutely no training with distorted images, pristine images, or subjective human scores to predict perceptual quality, but instead uses the intrinsic global change of the query image across scales. The performance of the proposed method is evaluated across six publicly available databases and found to be competitive with state-of-the-art techniques.
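A toy sketch of the cross-scale idea, assuming an 8-bit grayscale image; note it substitutes plain intensity histograms and an L1 distance for the paper's co-occurrence, variance, and gradient histograms, so it illustrates the low-pass-error concept only under those simplifying assumptions.

```python
import numpy as np
from scipy.ndimage import zoom

def cross_scale_dissimilarity(img, scales=(0.5, 0.25), bins=64):
    """Accumulate an L1 histogram distance between an image and its
    low-pass (down- then up-scaled) versions; distortion tends to reduce
    cross-scale redundancy and so increase this value."""
    img = img.astype(np.float64)
    h0, _ = np.histogram(img, bins=bins, range=(0, 255), density=True)
    total = 0.0
    for s in scales:
        low = zoom(zoom(img, s, order=1), 1.0 / s, order=1)  # low-pass proxy
        h, _ = np.histogram(low, bins=bins, range=(0, 255), density=True)
        total += 0.5 * np.abs(h0 - h).sum()
    return total
```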

28. Feng HC, Marcellin MW, Bilgin A. A Methodology for Visually Lossless JPEG2000 Compression of Monochrome Stereo Images. IEEE Transactions on Image Processing 2015; 24:560-572. [PMID: 25532207] [DOI: 10.1109/tip.2014.2384273]
Abstract
A methodology for visually lossless compression of monochrome stereoscopic 3D images is proposed. Visibility thresholds are measured for quantization distortion in JPEG2000. These thresholds are found to be functions of not only spatial frequency, but also of wavelet coefficient variance, as well as the gray level in both the left and right images. To avoid a daunting number of measurements during subjective experiments, a model for visibility thresholds is developed. The left image and right image of a stereo pair are then compressed jointly using the visibility thresholds obtained from the proposed model to ensure that quantization errors in each image are imperceptible to both eyes. This methodology is then demonstrated via a particular 3D stereoscopic display system with an associated viewing condition. The resulting images are visually lossless when displayed individually as 2D images, and also when displayed in stereoscopic 3D mode.

29. On the Performance of Video Quality Assessment Metrics Under Different Compression and Packet Loss Scenarios. The Scientific World Journal 2014; 2014:743604. [PMID: 24982988] [PMCID: PMC4055130] [DOI: 10.1155/2014/743604]
Abstract
When comparing the performance of video coding approaches, evaluating different commercial video encoders, or measuring the perceived video quality in a wireless environment, rate/distortion analysis is commonly used, where distortion is usually measured in terms of PSNR values. However, PSNR does not always capture the distortion perceived by a human being. As a consequence, significant efforts have focused on defining an objective video quality metric that can assess quality the way a human does. We study several available objective quality assessment metrics in order to evaluate their behavior in two different scenarios. First, we deal with video sequences compressed by different encoders at different bitrates in order to properly measure the video quality degradation associated with the encoding system. In addition, we evaluate the behavior of the quality metrics when measuring video distortions produced by packet losses in mobile ad hoc network scenarios with variable degrees of network congestion and node mobility. Our purpose is to determine whether the analyzed metrics can replace PSNR when comparing, designing, and evaluating video codec proposals, particularly under video delivery scenarios characterized by bursty and frequent packet losses, such as wireless multihop environments.
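Since PSNR is the baseline the study questions, its definition is worth stating; a minimal per-frame implementation, assuming 8-bit content (peak = 255):

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-size frames:
    PSNR = 10 * log10(peak^2 / MSE)."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)
```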

30. Oh H, Lee S. Visually Weighted Reconstruction of Compressive Sensing MRI. Magnetic Resonance Imaging 2014; 32:270-280. [DOI: 10.1016/j.mri.2012.11.008]

31. You J, Ebrahimi T, Perkis A. Attention Driven Foveated Video Quality Assessment. IEEE Transactions on Image Processing 2014; 23:200-213. [PMID: 24184726] [DOI: 10.1109/tip.2013.2287611]
Abstract
Contrast sensitivity of the human visual system to visual stimuli can be significantly affected by several mechanisms, e.g., vision foveation and attention. Existing studies on foveation-based video quality assessment only take into account the static foveation mechanism. This paper first proposes an advanced foveal imaging model that generates the perceived representation of video by integrating visual attention into the foveation mechanism. To accurately simulate the dynamic foveation mechanism, a novel approach to predicting video fixations is proposed by mimicking the essential functionality of eye movement. Consequently, an advanced contrast sensitivity function, derived from the attention-driven foveation mechanism, is modeled and then integrated into a wavelet-based distortion visibility measure to build a full-reference attention-driven foveated video quality (AFViQ) metric. AFViQ fully exploits perceptual visual mechanisms in video quality assessment. Extensive evaluation results on several publicly available eye-tracking and video quality databases demonstrate the promising performance of the proposed video attention model, fixation prediction approach, and quality metric.
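A common starting point for any foveated CSF is an eccentricity-dependent contrast threshold; the sketch below uses the classic Geisler-Perry form with typical published parameter values, which are assumptions here and not the paper's attention-driven model:

```python
import numpy as np

def foveated_contrast_sensitivity(f, e, ct0=1.0 / 64, alpha=0.106, e2=2.3):
    """Contrast sensitivity 1 / CT(f, e), with the Geisler-Perry threshold
    CT = ct0 * exp(alpha * f * (e + e2) / e2); f in cycles/degree,
    e (eccentricity) in degrees."""
    return 1.0 / (ct0 * np.exp(alpha * f * (e + e2) / e2))
```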

32. Laligant O, Truchetet F, Fauvet E. Noise Estimation From Digital Step-Model Signal. IEEE Transactions on Image Processing 2013; 22:5158-5167. [PMID: 24058030] [DOI: 10.1109/tip.2013.2282123]
Abstract
This paper addresses noise estimation in the digital domain and proposes a noise estimator based on a step signal model. It is efficient for any noise distribution because it does not rely only on the smallest amplitudes in the signal or image. The proposed approach uses polarized/directional derivatives and a nonlinear combination of these derivatives to estimate the noise distribution (e.g., Gaussian, Poisson, speckle, etc.). The moments of this measured distribution can be computed, and they are also calculated theoretically on the basis of noise distribution models. The 1D performance is detailed, and, as this paper is mostly dedicated to image processing, a 2D extension is proposed. The 2D performance for several noise distributions and noise models is presented and compared with selected other methods.

33.
Abstract
Image quality assessment (IQA) has been a topic of intense research over the last several decades. Each year brings an increasing number of new IQA algorithms, extensions of existing IQA algorithms, and applications of IQA to other disciplines. In this article, I first provide an up-to-date review of research in IQA and then highlight several open challenges in this field. The first half of the article discusses key properties of visual perception, image quality databases, and existing full-reference, no-reference, and reduced-reference IQA algorithms. Despite the remarkable progress that has been made in IQA, many fundamental challenges remain largely unsolved, and the second half of the article highlights some of these. I specifically discuss challenges related to the lack of complete perceptual models for natural images, for compound and suprathreshold distortions, and for multiple distortions and their interactive effects on images. I also discuss challenges related to IQA of images containing nontraditional distortions and challenges related to computational efficiency. The goal of this article is not only to help practitioners and researchers keep abreast of the recent advances in IQA, but also to raise awareness of the key limitations of current IQA knowledge.

34. Oh H, Bilgin A, Marcellin MW. Visually Lossless Encoding for JPEG2000. IEEE Transactions on Image Processing 2013; 22:189-201. [PMID: 22949058] [DOI: 10.1109/tip.2012.2215616]
Abstract
Due to exponential growth in image sizes, visually lossless coding is increasingly being considered as an alternative to numerically lossless coding, which has limited compression ratios. This paper presents a method of encoding color images in a visually lossless manner using JPEG2000. In order to hide coding artifacts caused by quantization, visibility thresholds (VTs) are measured and used for quantization of subband signals in JPEG2000. The VTs are experimentally determined from statistically modeled quantization distortion, which is based on the distribution of wavelet coefficients and the dead-zone quantizer of JPEG2000. The resulting VTs are adjusted for locally changing backgrounds through a visual masking model, and then used to determine the minimum number of coding passes to be included in the final codestream for visually lossless quality under the desired viewing conditions. Codestreams produced by this scheme are fully JPEG2000 Part-I compliant.
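The dead-zone quantizer mentioned above is standard in JPEG2000 Part 1 and compact enough to show; a minimal sketch, where setting the step size from a measured visibility threshold (VT) is what keeps the quantization error below the visible level (the reconstruction/coding-pass machinery is not shown):

```python
import numpy as np

def deadzone_quantize(coeffs, step):
    """JPEG2000 Part 1 dead-zone scalar quantizer on subband coefficients:
    q = sign(c) * floor(|c| / step), with a dead zone of width 2 * step
    around zero."""
    return np.sign(coeffs) * np.floor(np.abs(coeffs) / step)
```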

Affiliation(s)
- Han Oh, Digital Imaging Business Division, Samsung Electronics, Suwon 443-803, Korea

35. De I, Sil J. Entropy Based Fuzzy Classification of Images on Quality Assessment. Journal of King Saud University - Computer and Information Sciences 2012. [DOI: 10.1016/j.jksuci.2012.05.001]

36. Zhang F, Liu W, Lin W, Ngan KN. Spread Spectrum Image Watermarking Based on Perceptual Quality Metric. IEEE Transactions on Image Processing 2011; 20:3207-3218. [PMID: 21518660] [DOI: 10.1109/tip.2011.2146263]
Abstract
Efficient image watermarking calls for full exploitation of the perceptual distortion constraint. Second-order statistics of visual stimuli are regarded as critical features for perception. This paper proposes a second-order statistics (SOS)-based image quality metric that considers the texture masking effect and the contrast sensitivity in the Karhunen-Loève transform domain. Compared with state-of-the-art metrics, the quality prediction by SOS correlates better with several subjectively rated image databases in which the images are impaired by typical coding and watermarking artifacts. With an explicit metric definition, spread spectrum watermarking is posed as an optimization problem: we search for a watermark that minimizes the distortion of the watermarked image and maximizes the correlation between the watermark pattern and the spread spectrum carrier. The simple metric yields a closed-form solution for the optimal watermark and a fast implementation. Experiments show that the proposed watermarking scheme can take full advantage of the distortion constraint and improve robustness in return.
Affiliation(s)
- Fan Zhang
- Huazhong University of Science and Technology, Hubei, China.
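To make the second-order-statistics idea concrete, here is a rough, numpy-only sketch that compares per-component variances of reference and distorted image patches in a KLT (PCA) domain. The patch size, the linear frequency weighting standing in for contrast sensitivity, and the log-ratio pooling are assumptions, not the metric defined in the paper.

```python
import numpy as np

def patches(img, p=8):
    """Non-overlapping p-by-p patches, flattened to rows."""
    h, w = img.shape
    return np.stack([img[i:i + p, j:j + p].ravel()
                     for i in range(0, h - p + 1, p)
                     for j in range(0, w - p + 1, p)])

def sos_score(ref, dist, p=8):
    X, Y = patches(ref.astype(float), p), patches(dist.astype(float), p)
    X -= X.mean(0); Y -= Y.mean(0)
    cov = X.T @ X / len(X)          # KLT basis learned from the reference
    _, V = np.linalg.eigh(cov)
    V = V[:, ::-1]                  # order components by decreasing variance
    vx, vy = (X @ V).var(0), (Y @ V).var(0)
    # Down-weight late (typically high-frequency) components as a crude
    # contrast-sensitivity stand-in.
    w = np.linspace(1.0, 0.2, p * p)
    diff = w * np.abs(np.log((vy + 1e-6) / (vx + 1e-6)))
    return float(diff.mean())       # 0 means identical second-order stats
```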
|
37
|
Wang Z, Li Q. Information content weighting for perceptual image quality assessment. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2011; 20:1185-1198. [PMID: 21078577 DOI: 10.1109/tip.2010.2092435] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
Many state-of-the-art perceptual image quality assessment (IQA) algorithms share a common two-stage structure: local quality/distortion measurement followed by pooling. While significant progress has been made in measuring local image quality/distortion, the pooling stage is often done in ad hoc ways, lacking theoretical principles and reliable computational models. This paper aims to test the hypothesis that when viewing natural images, the optimal perceptual weights for pooling should be proportional to local information content, which can be estimated in units of bits using advanced statistical models of natural images. Our extensive studies, based upon six publicly available subject-rated image databases, concluded with three useful findings. First, information content weighting leads to consistent improvement in the performance of IQA algorithms. Second, surprisingly, with information content weighting, even the widely criticized peak signal-to-noise ratio can be converted into a competitive perceptual quality measure when compared with state-of-the-art algorithms. Third, the best overall performance is achieved by combining information content weighting with multiscale structural similarity measures.
Affiliation(s)
- Zhou Wang
- Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, N2L 3G1, Canada.
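The pooling hypothesis lends itself to a short sketch: average a local quality map with weights proportional to an estimate of local information content in bits. The Gaussian source-plus-noise information estimate, the noise variance, and the window size below are simplifications of the paper's statistical model.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def info_weighted_pool(quality_map, ref, dist, sigma_n2=0.4, win=7):
    """Pool a local quality map with weights proportional to an estimate of
    the local information content (in bits) of the reference and test images."""
    ref, dist = ref.astype(float), dist.astype(float)
    mu_x, mu_y = uniform_filter(ref, win), uniform_filter(dist, win)
    var_x = np.maximum(uniform_filter(ref * ref, win) - mu_x ** 2, 0)
    var_y = np.maximum(uniform_filter(dist * dist, win) - mu_y ** 2, 0)
    # Bits conveyed by a local Gaussian source observed through additive
    # noise of variance sigma_n2 -- a simplification of the paper's model.
    w = np.log2(1 + var_x / sigma_n2) + np.log2(1 + var_y / sigma_n2)
    return float((w * quality_map).sum() / (w.sum() + 1e-12))
```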
|
38
|
Cui L, Li W. Adaptive multiwavelet-based watermarking through JPW masking. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2011; 20:1047-1060. [PMID: 20876023 DOI: 10.1109/tip.2010.2079551] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
In this paper, a multibit, multiplicative, spread spectrum watermarking scheme using the discrete multiwavelet (including unbalanced and balanced multiwavelet) transform is presented. Performance improvement with respect to existing algorithms is obtained by means of a new just perceptual weighting (JPW) model. The new model incorporates various masking effects of human visual perception by taking into account the eye's sensitivity to noise changes depending on spatial frequency, luminance, and texture across all image subbands. In contrast to the conventional JND threshold model, JPW, which describes the minimum perceptual sensitivity weighting to noise changes, is better suited to nonadditive watermarking. Specifically, the watermarking strength is adaptively adjusted to obtain minimum perceptual distortion by employing the JPW model. Correspondingly, an adaptive optimum decoder is derived using a statistical model based on the generalized Gaussian distribution (GGD) for the multiwavelet coefficients of the cover image. Furthermore, the impact of multiwavelet characteristics on the proposed watermarking scheme is analyzed. Finally, the experimental results show that the proposed JPW model improves the quality of the watermarked image and makes the watermark more robust as compared with a variety of state-of-the-art algorithms.
Affiliation(s)
- Lihong Cui
- Department of Mathematics and Computer Science, Beijing University of Chemical Technology, Beijing 100080, China.
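A toy sketch of the adaptive, multiplicative embedding step may help. The weight model below (a local-activity term standing in for texture masking) only imitates the role of the JPW model, which in the paper is built on multiwavelets and several masking effects; the scalar wavelet, the alpha value, and the choice of band are assumptions.

```python
import numpy as np
import pywt

def embed(image, bits, key=0, alpha=0.08, wavelet="db2"):
    """Multiplicative spread-spectrum embedding in one detail band, with a
    local-activity weight standing in for the JPW model."""
    rng = np.random.default_rng(key)
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), wavelet)
    act = np.abs(cH) / (np.abs(cH).mean() + 1e-9)   # texture-masking proxy
    jpw = 0.5 + 0.5 * np.minimum(act, 2.0)          # assumed weight in [0.5, 1.5]
    chips = rng.choice([-1.0, 1.0], size=cH.shape)  # spread-spectrum carrier
    reps = cH.size // len(bits) + 1
    b = np.repeat(np.where(np.asarray(bits) > 0, 1.0, -1.0),
                  reps)[:cH.size].reshape(cH.shape)
    cHw = cH * (1.0 + alpha * jpw * chips * b)      # multiplicative rule
    return pywt.idwt2((cA, (cHw, cV, cD)), wavelet)

watermarked = embed(np.random.rand(128, 128) * 255, bits=[1, 0, 1, 1])
```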
|
39
|
Kumar B, Singh SP, Mohan A, Anand A. Novel MOS prediction models for compressed medical image quality. J Med Eng Technol 2011; 35:161-71. [DOI: 10.3109/03091902.2011.558169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]
|
40
|
Rouse DM, Hemami SS, Pépion R, Le Callet P. Estimating the usefulness of distorted natural images using an image contour degradation measure. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2011; 28:157-188. [PMID: 21293521 DOI: 10.1364/josaa.28.000157] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
Quality estimators aspire to quantify the perceptual resemblance, but not the usefulness, of a distorted image when compared to a reference natural image. However, humans can successfully accomplish tasks (e.g., object identification) using visibly distorted images that are not necessarily of high quality. A suite of novel subjective experiments reveals that quality does not accurately predict utility (i.e., usefulness). Thus, even accurate quality estimators cannot accurately estimate utility. In the absence of utility estimators, leading quality estimators are assessed as both quality and utility estimators and dismantled to understand those image characteristics that distinguish utility from quality. A newly proposed utility estimator demonstrates that a measure of contour degradation is sufficient to accurately estimate utility and is argued to be compatible with shape-based theories of object perception.
Affiliation(s)
- David M Rouse
- Visual Communications Laboratory, School of Electrical and Computer Engineering, Cornell University, 356 Rhodes Hall, Ithaca, New York 14850, USA.
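A simple stand-in for a contour-degradation utility measure can be sketched by comparing binary contour maps of the reference and distorted images. The Sobel-plus-threshold edge detector and the overlap score below are assumptions, not the measure calibrated in the paper.

```python
import numpy as np

def sobel_edges(img, thresh=0.2):
    """Binary contour map from Sobel gradient magnitude (circular borders)."""
    k = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], float)
    g = img.astype(float)
    gx = sum(k[i, j] * np.roll(np.roll(g, i - 1, 0), j - 1, 1)
             for i in range(3) for j in range(3))
    gy = sum(k.T[i, j] * np.roll(np.roll(g, i - 1, 0), j - 1, 1)
             for i in range(3) for j in range(3))
    mag = np.hypot(gx, gy)
    return mag > thresh * mag.max()

def contour_utility(ref, dist):
    """Fraction of reference contour pixels preserved in the distorted image."""
    e_ref, e_dist = sobel_edges(ref), sobel_edges(dist)
    return np.logical_and(e_ref, e_dist).sum() / max(e_ref.sum(), 1)
```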
|
41
|
Islam MI, Begum N, Alam M, Amin M. Fingerprint Detection Using Canny Filter and DWT, a New Approach. JOURNAL OF INFORMATION PROCESSING SYSTEMS 2010. [DOI: 10.3745/jips.2010.6.4.511] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
|
42
|
Chou CH, Liu KC. A perceptually tuned watermarking scheme for color images. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2010; 19:2966-2982. [PMID: 20529748 DOI: 10.1109/tip.2010.2052261] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
Transparency and robustness are two conflicting requirements demanded by digital image watermarking for copyright protection and many other purposes. A feasible way to satisfy the two conflicting requirements simultaneously is to embed high-strength watermark signals in host signals that can accommodate the distortion due to watermark insertion as part of perceptual redundancy. The search for distortion-tolerable host signals for watermark insertion and the determination of watermark strength are hence crucial to realizing a transparent yet robust watermark. This paper presents a color image watermarking scheme that hides watermark signals in the most distortion-tolerable signals within the three color channels of the host image without resulting in perceivable distortion. The distortion-tolerable host signals, i.e., the signals that possess high perceptual redundancy, are sought in the wavelet domain for watermark insertion. A visual model based on the CIEDE2000 color difference equation is used to measure the perceptual redundancy inherent in each wavelet coefficient of the host image. By means of quantization index modulation, binary watermark signals are embedded in qualified wavelet coefficients. To reinforce robustness, the watermark signals are repeated and permuted before embedding, and restored by a majority-vote decision-making process in watermark extraction. Original images are not required for watermark extraction; only a small amount of information, including the locations of qualified coefficients and the data associated with coefficient quantization, is needed. Experimental results show that the embedded watermark is transparent and quite robust in the face of various attacks such as cropping, low-pass filtering, scaling, median filtering, and white-noise addition, as well as JPEG and JPEG2000 coding at high compression ratios.
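The quantization-index-modulation step the scheme applies to qualified wavelet coefficients can be sketched as follows. Here every coefficient qualifies and the step size is fixed, whereas the paper derives a per-coefficient step from its CIEDE2000-based redundancy model.

```python
import numpy as np

def qim_embed(coeffs, bits, step):
    """One bit per coefficient: snap to the even (bit 0) or half-step-offset
    (bit 1) quantizer lattice."""
    c = coeffs.ravel().copy()
    for i, b in enumerate(bits):
        offset = 0.0 if b == 0 else step / 2.0
        c[i] = np.round((c[i] - offset) / step) * step + offset
    return c.reshape(coeffs.shape)

def qim_extract(coeffs, n_bits, step):
    """Decide each bit by which lattice the coefficient sits closer to."""
    c = coeffs.ravel()
    out = []
    for i in range(n_bits):
        d0 = abs(c[i] - np.round(c[i] / step) * step)
        d1 = abs(c[i] - (np.round((c[i] - step / 2) / step) * step + step / 2))
        out.append(0 if d0 <= d1 else 1)
    return out

band = np.random.randn(16, 16) * 10
marked = qim_embed(band, [1, 0, 1, 1, 0], step=2.0)
assert qim_extract(marked, 5, step=2.0) == [1, 0, 1, 1, 0]
```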
|
43
|
Bae SH, Pappas TN, Juang BH. Subjective evaluation of spatial resolution and quantization noise tradeoffs. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2009; 18:495-508. [PMID: 19150799 DOI: 10.1109/tip.2008.2009796] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Most full-reference fidelity/quality metrics compare the original image to a distorted image at the same resolution, assuming a fixed viewing condition. However, in many applications, such as video streaming, due to the diversity of channel capacities and display devices, the viewing distance and the spatiotemporal resolution of the displayed signal may be adapted in order to optimize the perceived signal quality. For example, in low-bitrate coding applications an observer may prefer to reduce the resolution or increase the viewing distance to reduce the visibility of the compression artifacts. The tradeoff between resolution/viewing conditions and the visibility of compression artifacts requires new approaches to the evaluation of image quality that account for both image distortions and image size. In order to better understand such tradeoffs, we conducted subjective tests using two representative still image coders, JPEG and JPEG 2000. Our results indicate that an observer would indeed prefer a lower spatial resolution (at a fixed viewing distance) in order to reduce the visibility of the compression artifacts, but not all the way to the point where the artifacts are completely invisible. Moreover, the observer is willing to accept more artifacts as the image size decreases. The subjective test results we report can be used to select viewing conditions for coding applications. They also set the stage for the development of novel fidelity metrics. The focus of this paper is on still images, but it is expected that similar tradeoffs apply to video.
Affiliation(s)
- Soo Hyun Bae
- Center for Signal and Image Processing, Georgia Institute of Technology, Atlanta, GA 30332-0250, USA.
|
44
|
Brooks AC, Zhao X, Pappas TN. Structural similarity quality metrics in a coding context: exploring the space of realistic distortions. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2008; 17:1261-1273. [PMID: 18632337 DOI: 10.1109/tip.2008.926161] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
Perceptual image quality metrics have explicitly accounted for human visual system (HVS) sensitivity to subband noise by estimating just noticeable distortion (JND) thresholds. A recently proposed class of quality metrics, known as structural similarity metrics (SSIM), models perception implicitly by taking into account the fact that the HVS is adapted for extracting structural information from images. We evaluate SSIM metrics and compare their performance to traditional approaches in the context of realistic distortions that arise from compression and error concealment in video compression/transmission applications. In order to better explore this space of distortions, we propose models for simulating typical distortions encountered in such applications. We compare specific SSIM implementations both in the image space and the wavelet domain; these include the complex wavelet SSIM (CWSSIM), a translation-insensitive SSIM implementation. We also propose a perceptually weighted multiscale variant of CWSSIM, which introduces a viewing distance dependence and provides a natural way to unify the structural similarity approach with the traditional JND-based perceptual approaches.
Affiliation(s)
- Alan C Brooks
- Defensive Systems Division, Northrop Grumman Corporation, Rolling Meadows, IL 60008, USA.
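For readers who want the baseline in code, here is a compact single-scale SSIM plus a multiscale wrapper with per-scale weights. The uniform window, the placeholder weights, and the plain dyadic downsampling are assumptions, and the complex-wavelet (CWSSIM) variant is omitted.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ssim(x, y, L=255.0, win=8):
    """Mean single-scale SSIM with a uniform window."""
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    x, y = x.astype(float), y.astype(float)
    mx, my = uniform_filter(x, win), uniform_filter(y, win)
    vx = uniform_filter(x * x, win) - mx ** 2
    vy = uniform_filter(y * y, win) - my ** 2
    cxy = uniform_filter(x * y, win) - mx * my
    s = ((2 * mx * my + C1) * (2 * cxy + C2)) / \
        ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
    return float(s.mean())

def weighted_ms_ssim(x, y, weights=(0.1, 0.3, 0.6)):
    """Per-scale weighted sum; real viewing-distance-dependent weights and
    proper anti-aliased downsampling would replace the placeholders here."""
    score = 0.0
    for w in weights:
        score += w * ssim(x, y)
        x, y = x[::2, ::2], y[::2, ::2]   # plain dyadic downsampling
    return score
```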
|
46
|
André T, Antonini M, Barlaud M, Gray RM. Entropy-based distortion measure and bit allocation for wavelet image compression. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2007; 16:3058-3064. [PMID: 18092603 DOI: 10.1109/tip.2007.909408] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
|
47
|
Karam LJ, Lam TT. Selective error detection for error-resilient wavelet-based image coding. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2007; 16:2936-2942. [PMID: 18092593 DOI: 10.1109/tip.2007.909321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
This paper introduces the concept of a similarity check function for error-resilient multimedia data transmission. The proposed similarity check function provides information about the effects of corrupted data on the quality of the reconstructed image. The degree of data corruption is measured by the similarity check function at the receiver, without explicit knowledge of the original source data. The design of a perceptual similarity check function is presented for wavelet-based coders such as the JPEG2000 standard, and used with a proposed "progressive similarity-based ARQ" (ProS-ARQ) scheme to significantly decrease the retransmission rate of corrupted data while maintaining very good visual quality of images transmitted over noisy channels. Simulation results with JPEG2000-coded images transmitted over the binary symmetric channel show that the proposed ProS-ARQ scheme significantly reduces the number of retransmissions as compared to conventional ARQ-based schemes. The presented results also show that, for the same number of retransmitted data packets, the proposed ProS-ARQ scheme can achieve significantly higher PSNR and better visual quality as compared to the selective-repeat ARQ scheme.
Affiliation(s)
- Lina J Karam
- Department of Electrical Engineering, Arizona State University, Tempe, AZ 85287-5706, USA.
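The control flow of a similarity-check-driven ARQ loop can be sketched schematically. The corruption check below (counting implausibly large wavelet detail coefficients) is a made-up stand-in for the paper's perceptual similarity check function, and the packet and decoder interfaces are hypothetical.

```python
import numpy as np
import pywt

def plausibility_check(decoded, k=8.0):
    """Receiver-side corruption check without the original: gross bit errors
    tend to throw wavelet detail coefficients far outside the band's
    typical range."""
    _, (cH, cV, cD) = pywt.dwt2(decoded.astype(float), "db2")
    bands = np.concatenate([b.ravel() for b in (cH, cV, cD)])
    sigma = np.median(np.abs(bands)) / 0.6745 + 1e-9  # robust scale estimate
    return float(np.mean(np.abs(bands) > k * sigma))  # fraction of outliers

def pros_arq_loop(packets, decode, tol=1e-3, max_retx=3):
    """Accept packets progressively; request retransmission only when the
    check suggests visible corruption. `decode` and `pkt.retransmit()` are
    hypothetical interfaces supplied by the codec and transport layers."""
    accepted = []
    for pkt in packets:
        for _ in range(max_retx + 1):
            if plausibility_check(decode(accepted + [pkt])) <= tol:
                accepted.append(pkt)
                break
            pkt = pkt.retransmit()  # hypothetical channel call
    return decode(accepted)
```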
|
48
|
Chandler DM, Hemami SS. VSNR: a wavelet-based visual signal-to-noise ratio for natural images. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2007; 16:2284-98. [PMID: 17784602 DOI: 10.1109/tip.2007.901820] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/17/2023]
Abstract
This paper presents an efficient metric for quantifying the visual fidelity of natural images based on near-threshold and suprathreshold properties of human vision. The proposed metric, the visual signal-to-noise ratio (VSNR), operates via a two-stage approach. In the first stage, contrast thresholds for detection of distortions in the presence of natural images are computed via wavelet-based models of visual masking and visual summation in order to determine whether the distortions in the distorted image are visible. If the distortions are below the threshold of detection, the distorted image is deemed to be of perfect visual fidelity (VSNR = infinity) and no further analysis is required. If the distortions are suprathreshold, a second stage is applied which operates based on the low-level visual property of perceived contrast, and the mid-level visual property of global precedence. These two properties are modeled as Euclidean distances in distortion-contrast space of a multiscale wavelet decomposition, and VSNR is computed based on a simple linear sum of these distances. The proposed VSNR metric is generally competitive with current metrics of visual fidelity; it is efficient both in terms of its low computational complexity and in terms of its low memory requirements; and it operates based on physical luminances and visual angle (rather than on digital pixel values and pixel-based dimensions) to accommodate different viewing conditions.
Affiliation(s)
- Damon M Chandler
- School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078, USA.
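A much-simplified echo of VSNR's two-stage structure is sketched below: stage one tests whether per-scale distortion contrast exceeds a detection threshold, and stage two combines a perceived-contrast distance with a crude global-precedence term. The fixed thresholds, the ordering-based precedence penalty, and the equal weighting are placeholders for the psychophysically derived models in the paper.

```python
import numpy as np
import pywt

def band_contrasts(err, wavelet="db2", levels=3):
    """RMS of the error signal at each wavelet scale (coarse to fine)."""
    coeffs = pywt.wavedec2(err, wavelet, level=levels)
    return np.array([np.sqrt(np.mean(np.concatenate(
        [b.ravel() for b in bands]) ** 2)) for bands in coeffs[1:]])

def vsnr_like(ref, dist, thresholds=(2.0, 2.0, 2.0), alpha=0.5):
    ref, dist = ref.astype(float), dist.astype(float)
    c = band_contrasts(dist - ref, levels=len(thresholds))
    if np.all(c <= np.asarray(thresholds)):
        return np.inf                 # distortions deemed below detection
    d_pc = np.linalg.norm(c)          # perceived-contrast distance
    # Global-precedence stand-in: penalize band-contrast profiles that are
    # not dominated by the coarse scales.
    d_gp = np.linalg.norm(c - np.sort(c)[::-1])
    signal = np.linalg.norm(ref - ref.mean())
    return 20 * np.log10(signal / (alpha * d_pc + (1 - alpha) * d_gp + 1e-12))
```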
|
49
|
Gaubatz MD, Hemami SS. Efficient entropy estimation based on doubly stochastic models for quantized wavelet image data. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2007; 16:967-81. [PMID: 17405430 DOI: 10.1109/tip.2007.891784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
Under a rate constraint, wavelet-based image coding involves strategic discarding of information such that the remaining data can be described with a given amount of rate. In a practical coding system, this task requires knowledge of the relationship between quantization step size and compressed rate for each group of wavelet coefficients: the R-Q curve. A common approach to this problem is to fit each subband with a scalar probability distribution and compute entropy estimates based on the model. This approach is not effective at rates below 1.0 bits per pixel because the distributions of quantized data do not reflect the dependencies in coefficient magnitudes. These dependencies can be addressed with doubly stochastic models, which have been previously proposed to characterize more localized behavior, though there are tradeoffs between storage, computation time, and accuracy. Using a doubly stochastic generalized Gaussian model, it is demonstrated that the relationship between step size and rate is accurately described by a low-degree polynomial in the logarithm of the step size. Based on this observation, an entropy estimation scheme is presented which offers an excellent tradeoff between speed and accuracy; after a simple data-gathering step, estimates are computed instantaneously by evaluating a single polynomial for each group of wavelet coefficients quantized with the same step size. These estimates are on average within 3% of a desired target rate for several state-of-the-art coders.
Affiliation(s)
- Matthew D Gaubatz
- Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA.
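The paper's key practical observation, that rate behaves as a low-degree polynomial in the logarithm of the step size, reduces R-Q queries to a single polynomial evaluation after a short data-gathering step. A sketch with synthetic calibration points (the polynomial degree and the "measured" rates are assumptions):

```python
import numpy as np

def fit_rq_curve(steps, measured_rates, degree=3):
    """Fit rate (bits per coefficient) against the log of the step size."""
    return np.polyfit(np.log(steps), measured_rates, degree)

def estimate_rate(poly, step):
    """Instantaneous rate estimate for a new step size."""
    return float(np.polyval(poly, np.log(step)))

# Synthetic calibration data for one group of coefficients:
steps = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
rates = np.array([4.1, 3.2, 2.4, 1.6, 0.9, 0.4])   # hypothetical measurements
poly = fit_rq_curve(steps, rates)
print(estimate_rate(poly, 3.0))
```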
|
50
|
Gaubatz MD, Hemami SS. Ordering for embedded coding of wavelet image data based on arbitrary scalar quantization schemes. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2007; 16:982-96. [PMID: 17405431 DOI: 10.1109/tip.2007.891793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
Many modern wavelet quantization schemes specify wavelet coefficient step sizes as continuous functions of an input step-size selection criterion; rate control is achieved by selecting an appropriate set of step sizes. In embedded wavelet coders, however, rate control is achieved simply by truncating the coded bit stream at the desired rate. The order in which wavelet data are coded implicitly controls the quantization step sizes applied to create the reconstructed image. Since these step sizes are effectively discontinuous, piecewise-constant functions of rate, this paper examines the problem of designing a coding order for such a coder, guided by a quantization scheme in which step sizes evolve continuously with rate. In particular, it formulates an optimization problem that minimizes the average relative difference between the piecewise-constant implicit step sizes associated with a layered coding strategy and the smooth step sizes given by a quantization scheme. The solution to this problem implies a coding order. Elegant, near-optimal solutions are presented that optimize step sizes over a variety of rate regions, either continuous or discrete. This method can be used to create layers of coded data using any scalar quantization scheme combined with any wavelet bit-plane coder. It is illustrated using a variety of state-of-the-art coders and quantization schemes. In addition, the proposed method is verified with objective and subjective testing.
Affiliation(s)
- Matthew D Gaubatz
- Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA.
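A greedy toy version of the ordering problem conveys the idea: visiting a band's next bit-plane halves its implicit step size, and the next band to visit is the one lagging furthest behind a smooth target step-size schedule. The schedule and the log-relative-error objective are simplified from the paper's formulation.

```python
import numpy as np

def design_order(init_steps, target_steps_at_layers):
    """init_steps: implicit step size per band before any coding.
    target_steps_at_layers: (n_layers, n_bands) desired steps per layer.
    Returns which band's next bit-plane to code at each layer."""
    steps = np.array(init_steps, float)
    order = []
    for targets in np.asarray(target_steps_at_layers, float):
        gap = np.log(steps / targets)  # relative lag behind the schedule
        b = int(np.argmax(gap))        # band lagging furthest behind
        steps[b] /= 2.0                # one more bit-plane halves its step
        order.append(b)
    return order

# Example: three bands tracked over four layers of a hypothetical schedule.
layers = [[8, 16, 32], [8, 12, 24], [6, 10, 16], [4, 8, 12]]
print(design_order([16, 32, 64], layers))   # -> [0, 1, 2, 2]
```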
|