1. Zhang Q, Zhou J, Zhu L, Sun W, Xiao C, Zheng WS. Unsupervised Intrinsic Image Decomposition Using Internal Self-Similarity Cues. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:9669-9686. [PMID: 34813466] [DOI: 10.1109/tpami.2021.3129795]
Abstract
Recent learning-based intrinsic image decomposition methods have achieved remarkable progress. However, they usually require massive ground truth intrinsic images for supervised learning, which limits their applicability to real-world images, since obtaining ground truth intrinsic decompositions for natural images is very challenging. In this paper, we present an unsupervised framework that is able to learn the decomposition effectively from a single natural image by training solely with the image itself. Our approach is built upon the observations that the reflectance of a natural image typically has high internal self-similarity of patches, and that a convolutional generation network tends to boost the self-similarity of an image when trained for image reconstruction. Based on these observations, an unsupervised intrinsic decomposition network (UIDNet) consisting of two fully convolutional encoder-decoder sub-networks, i.e., a reflectance prediction network (RPN) and a shading prediction network (SPN), is devised to decompose an image into reflectance and shading by promoting the internal self-similarity of the reflectance component, in a way that jointly trains the RPN and SPN to reproduce the given image. A novel loss function is also designed to make the training effective for intrinsic decomposition. Experimental results on three benchmark real-world datasets demonstrate the superiority of the proposed method.
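The multiplicative image-formation model that this style of decomposition builds on can be made concrete in a few lines. The sketch below (variable names and sizes are my own, not from the paper) shows the reconstruction objective and why it is ill-posed on its own, which is what the self-similarity prior addresses:

```python
import numpy as np

# Toy illustration of the image-formation model behind intrinsic
# decomposition: an observed image I is the per-pixel product of
# reflectance R and shading S.
rng = np.random.default_rng(0)

H, W = 8, 8
reflectance = rng.uniform(0.2, 1.0, size=(H, W))   # material component
shading = rng.uniform(0.1, 1.0, size=(H, W))       # illumination component
image = reflectance * shading                      # I = R * S

# Reconstruction objective used by decomposition networks trained to
# reproduce the input: the predicted components must multiply back to I.
def reconstruction_loss(pred_r, pred_s, img):
    return np.mean((pred_r * pred_s - img) ** 2)

# A perfect decomposition has zero reconstruction loss...
assert np.isclose(reconstruction_loss(reflectance, shading, image), 0.0)
# ...but so does any rescaled pair (k*R, S/k): reconstruction alone is
# ill-posed, which is why a prior such as reflectance self-similarity
# is needed to pick one decomposition.
k = 2.0
assert np.isclose(reconstruction_loss(k * reflectance, shading / k, image), 0.0)
```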
2. Forsyth D, Rock JJ. Intrinsic Image Decomposition Using Paradigms. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:7624-7637. [PMID: 34648429] [DOI: 10.1109/tpami.2021.3119551]
Abstract
Intrinsic image decomposition is the task of mapping an image to albedo and shading. Classical approaches derive methods from spatial models. The modern literature stresses evaluation, by comparing predictions to human judgements ("lighter", "same as", "darker"). The best modern intrinsic image methods train a map from image to albedo using images rendered from computer graphics models and example human judgements. This approach yields practical methods, but obtaining rendered images can be inconvenient. Furthermore, the approach cannot explain how one could learn to recover intrinsic images without geometric, surface and illumination models, as people and animals appear to do. This paper describes a method that learns intrinsic image decomposition without seeing human annotations, rendered data, or ground truth data. Instead, the method relies on paradigms - spatial models of albedo and of shading. Rather than finding the "best" albedo and shading for an image via optimization, our approach trains a neural network on synthetic images. The synthetic images are constructed by multiplying albedos and shading fields sampled from our models. The network is subject to a novel smoothing procedure that ensures good behavior at short scales on real images. An averaging procedure ensures that reported albedo and shading are largely equivariant - different crops and scalings of an image will report the same albedo and shading at shared points. This averaging procedure controls long-scale error. The standard evaluation for an intrinsic image method is a WHDR score. Our method achieves WHDR scores competitive with those of strong recent methods allowed to see training WHDR annotations, rendered data, and ground truth data. Our method produces albedo and shading maps with attractive qualitative properties - for example, albedo fields do not suppress wood grain and represent narrow grooves in surfaces well. Because our method is unsupervised, we can compute estimates of the test/train variance of WHDR scores; these are quite large, and suggest it is unsafe to rely on small differences in reported WHDR.
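The paradigm idea of constructing synthetic training images by multiplying sampled albedo and shading fields can be sketched as follows. This is a toy "Mondrian plus smooth field" version with invented parameters, not the authors' actual models:

```python
import numpy as np

# Synthesize a training pair (image, albedo, shading) by sampling a
# piecewise-constant albedo field and a smooth shading field, then
# multiplying them. All model choices here are illustrative assumptions.
rng = np.random.default_rng(1)
H = W = 32

def sample_albedo():
    """Piecewise-constant albedo: random axis-aligned rectangles of flat tone."""
    a = np.full((H, W), 0.5)
    for _ in range(6):
        r0, c0 = rng.integers(0, H - 4), rng.integers(0, W - 4)
        r1, c1 = rng.integers(r0 + 2, H), rng.integers(c0 + 2, W)
        a[r0:r1, c0:c1] = rng.uniform(0.1, 1.0)
    return a

def sample_shading():
    """Smooth shading: a low-frequency ramp plus a gentle sinusoid."""
    y, x = np.mgrid[0:H, 0:W] / H
    return 0.4 + 0.5 * x + 0.1 * np.sin(2 * np.pi * y)

albedo, shading = sample_albedo(), sample_shading()
image = albedo * shading   # synthetic supervision without any real data
```

A network trained on many such triples sees ground truth albedo and shading for free, at the cost of the realism of the sampled models.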
3. Bahraini T, Hamedani T, Hosseini SM, Yazdi HS. Edge preserving range image smoothing using hybrid locally kernel-based weighted least square. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109234]
4. Sang Y, Zhang S, He H, Li Q, Zhang X. Brightness-gradient difference feature guided shadow removal method. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107986]
5. Liang L, Jin L, Xu Y. PDE Learning of Filtering and Propagation for Task-Aware Facial Intrinsic Image Analysis. IEEE Transactions on Cybernetics 2022; 52:1021-1034. [PMID: 32459622] [DOI: 10.1109/tcyb.2020.2989610]
Abstract
Filtering and propagation are two basic operations in image analysis and rendering, and they are also widely used in computer graphics and machine learning. However, the models of filtering and propagation were based on diverse mathematical formulations, which have not been fully understood. This article aims to explore the properties of both filtering and propagation models from a partial differential equation (PDE) learning perspective. We propose a unified PDE learning framework based on nonlinear reaction-diffusion with a guided map, graph Laplacian, and reaction weight. It reveals that: 1) the guided map and reaction weight determine whether the PDE produces filtering or propagation diffusion and 2) the kernel of the graph Laplacian controls the diffusion pattern. Based on the proposed PDE framework, we derive the mathematical relations between different models, including the learning to diffusion (LTD) model, label propagation, edit propagation, and edge-aware filters. In practical verification, we apply the PDE framework to design diffusion operations with an adaptive kernel to tackle the ill-posed problem of facial intrinsic image analysis (FIIA). A flexible task-aware FIIA system is built to achieve various facial rendering effects, such as face image relighting and delighting, artistic illumination transfer, and illumination-aware face swapping or transfiguring. Qualitative and quantitative experiments show the effectiveness and flexibility of task-aware FIIA and provide new insights on PDE learning for visual analysis and rendering.
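The unified reaction-diffusion form can be sketched with an explicit discretization (the graph, step size and weight below are my own choices, not the paper's): an update du/dt = -L u - w (u - g), where L is a graph Laplacian, g the guided map, and w the reaction weight. With w = 0 the update purely diffuses (filtering); with w > 0 values are pulled toward the guide (propagation of constraints in g).

```python
import numpy as np

n = 5
# Combinatorial Laplacian of a 5-node path graph.
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1

u = np.array([0.0, 1.0, 0.0, 1.0, 0.0])   # noisy signal to process
g = np.array([0.0, 0.0, 0.5, 1.0, 1.0])   # guided map / sparse labels
w, dt = 0.5, 0.1                           # reaction weight, step size

# Explicit Euler steps of du/dt = -L u - w (u - g).
for _ in range(200):
    u = u - dt * (L @ u + w * (u - g))

# The steady state solves (L + w I) u = w g: a balance between graph
# smoothness (diffusion) and fidelity to the guide (reaction).
```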
6. Gu Q, Su J, Yuan L. Visual affordance detection using an efficient attention convolutional neural network. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.01.018]
7. Baslamisli AS, Gevers T. Invariant descriptors for intrinsic reflectance optimization. Journal of the Optical Society of America A 2021; 38:887-896. [PMID: 34143158] [DOI: 10.1364/josaa.414682]
Abstract
Intrinsic image decomposition aims to factorize an image into albedo (reflectance) and shading (illumination) sub-components. Being ill-posed and under-constrained, it is a very challenging computer vision problem. There are infinite pairs of reflectance and shading images that can reconstruct the same input. To address the problem, Intrinsic Images in the Wild by Bell et al. provides an optimization framework based on a dense conditional random field (CRF) formulation that considers long-range material relations. We improve upon their model by introducing illumination-invariant image descriptors: color ratios. The color ratios and the intrinsic reflectance are both invariant to illumination and thus are highly correlated. Through detailed experiments, we provide ways to inject the color ratios into the dense CRF optimization. Our approach is physics-based and learning-free and leads to more accurate and robust reflectance decompositions.
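Why a color ratio is illumination invariant follows directly from the Lambertian model and can be checked numerically. This is an assumption-level illustration, not the paper's full descriptor: under I_c(p) = R_c(p) * S(p), two neighboring pixels p and q that share the same shading S give a per-channel ratio I_c(p)/I_c(q) = R_c(p)/R_c(q), so the shading cancels and the ratio depends on reflectance only.

```python
import numpy as np

# Two neighboring pixels with different albedos (values invented).
reflectance_p = np.array([0.8, 0.3, 0.1])   # RGB albedo at pixel p
reflectance_q = np.array([0.4, 0.6, 0.2])   # RGB albedo at neighbor q

for shading in (0.2, 0.9):                  # two very different illuminations
    I_p = reflectance_p * shading           # observed intensities at p
    I_q = reflectance_q * shading           # observed intensities at q
    ratio = I_p / I_q
    # The ratio equals the reflectance ratio regardless of the shading level.
    assert np.allclose(ratio, reflectance_p / reflectance_q)
```

This is why the ratios correlate so strongly with intrinsic reflectance and can act as extra constraints inside the CRF optimization.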
8. Lerer A, Supèr H, Keil MS. Dynamic decorrelation as a unifying principle for explaining a broad range of brightness phenomena. PLoS Comput Biol 2021; 17:e1007907. [PMID: 33901165] [PMCID: PMC8102013] [DOI: 10.1371/journal.pcbi.1007907]
Abstract
The visual system is highly sensitive to spatial context for encoding luminance patterns. Context sensitivity inspired the proposal of many neural mechanisms for explaining the perception of luminance (brightness). Here we propose a novel computational model for estimating the brightness of many visual illusions. We hypothesize that many aspects of brightness can be explained by a dynamic filtering process that reduces the redundancy in edge representations on the one hand, while non-redundant activity is enhanced on the other. The dynamic filter is learned for each input image and implements context sensitivity. Dynamic filtering is applied to the responses of (model) complex cells in order to build a gain control map. The gain control map then acts on simple cell responses before they are used to create a brightness map via activity propagation. Our approach is successful in predicting many challenging visual illusions, including contrast effects, assimilation, and reverse contrast with the same set of model parameters.
Affiliation(s)
- Alejandro Lerer: Departament de Cognició, Desenvolupament i Psicologia de l’Educació, Faculty of Psychology, University of Barcelona, Barcelona, Spain
- Hans Supèr: Departament de Cognició, Desenvolupament i Psicologia de l’Educació, Faculty of Psychology, University of Barcelona, Barcelona, Spain; Institut de Neurociències, Universitat de Barcelona, Barcelona, Spain; Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, Spain; Catalan Institute for Advanced Studies (ICREA), Barcelona, Spain
- Matthias S. Keil: Departament de Cognició, Desenvolupament i Psicologia de l’Educació, Faculty of Psychology, University of Barcelona, Barcelona, Spain; Institut de Neurociències, Universitat de Barcelona, Barcelona, Spain
9. De A, Horwitz GD. Spatial receptive field structure of double-opponent cells in macaque V1. J Neurophysiol 2021; 125:843-857. [PMID: 33405995] [DOI: 10.1152/jn.00547.2020]
Abstract
The spatial processing of color is important for visual perception. Double-opponent (DO) cells likely contribute to this processing by virtue of their spatially opponent and cone-opponent receptive fields (RFs). However, the representation of visual features by DO cells in the primary visual cortex of primates is unclear because the spatial structure of their RFs has not been fully characterized. To fill this gap, we mapped the RFs of DO cells in awake macaques with colorful, dynamic white noise patterns. The spatial RF of each neuron was fitted with a Gabor function and three versions of the difference of Gaussians (DoG) function. The Gabor function provided the more accurate description for most DO cells, a result that is incompatible with a center-surround RF organization. A nonconcentric version of the DoG function, in which the RFs have a circular center and a crescent-shaped surround, performed nearly as well as the Gabor model, thus reconciling results from previous reports. For comparison, we also measured the RFs of simple cells. We found that the superiority of the Gabor fits over DoG fits was slightly more decisive for simple cells than for DO cells. The implications of these results for biological image processing and visual perception are discussed. NEW & NOTEWORTHY Double-opponent cells in macaque area V1 respond to spatial chromatic contrast in visual scenes. What information they carry is debated because their receptive field organization has not been characterized thoroughly. Using white noise analysis and statistical model comparisons, De and Horwitz show that many double-opponent receptive fields can be captured by either a Gabor model or a center-with-an-asymmetric-surround model, but not by a difference of Gaussians model.
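The two candidate RF models being compared can be written out in 1-D for concreteness (parameter names and values below are my own). The structural difference the model comparison exploits: a Gabor can place opponent subfields side by side, while a concentric DoG is always symmetric about its center.

```python
import numpy as np

x = np.linspace(-3, 3, 301)   # 1-D spatial axis, symmetric about 0

def gabor(x, sigma=1.0, freq=0.5, phase=0.0):
    """Gaussian-windowed sinusoid: the oriented, multi-subfield RF model."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def dog(x, sigma_c=0.5, sigma_s=1.5, k=0.9):
    """Difference of Gaussians: the concentric center-surround RF model."""
    center = np.exp(-x**2 / (2 * sigma_c**2)) / sigma_c
    surround = np.exp(-x**2 / (2 * sigma_s**2)) / sigma_s
    return (center - k * surround) / np.sqrt(2 * np.pi)

# An odd-phase Gabor is antisymmetric: opponent subfields side by side...
g = gabor(x, phase=np.pi / 2)
assert np.allclose(g, -g[::-1], atol=1e-12)
# ...whereas a DoG is always symmetric about the center, whatever its
# parameters, so it cannot express that side-by-side opponency.
d = dog(x)
assert np.allclose(d, d[::-1], atol=1e-12)
```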
Affiliation(s)
- Abhishek De: Systems Neurobiology Laboratories, Salk Institute for Biological Studies, La Jolla, California; Department of Physiology and Biophysics, Washington National Primate Research Center, University of Washington, Seattle, Washington
- Gregory D Horwitz: Department of Physiology and Biophysics, Washington National Primate Research Center, University of Washington, Seattle, Washington
10. Asif M, Song H, Chen L, Yang J, Frangi AF. Intrinsic layer based automatic specular reflection detection in endoscopic images. Comput Biol Med 2020; 128:104106. [PMID: 33221640] [DOI: 10.1016/j.compbiomed.2020.104106]
Abstract
Endoscopic images are used to observe the internal structure of the human body. Specular reflections (SR) are mostly a consequence of strong light and appear as bright regions on endoscopic images, which affects the performance of minimally invasive surgery. In this study, we propose a novel method for automatic SR detection based on intrinsic image layer separation (IILS). The proposed method consists of three steps: the image is first normalized, high-gradient areas are then extracted, and SR is finally separated on the basis of a color model. The image melding technique is utilized to reconstruct the reflected pixels. The experiments were conducted on 912 endoscopic images from CVC-EndoSceneStill. The results for accuracy, sensitivity, specificity, precision, Jaccard index, Dice coefficient, standard deviation, and pixel count difference show that the detection performance of the proposed method outperforms that of other state-of-the-art methods. The evaluation of the proposed IILS-based SR detection demonstrates that our method obtains better qualitative and quantitative assessments compared with other methods, and it can serve as a promising preprocessing step for further analysis of endoscopic images.
Affiliation(s)
- Hong Song: Beijing Institute of Technology, Beijing, China
- Lei Chen: Beijing Institute of Technology, Beijing, China
- Jian Yang: Beijing Institute of Technology, Beijing, China
11. Zuo X, Wang S, Zheng J, Pan Z, Yang R. Detailed Surface Geometry and Albedo Recovery from RGB-D Video under Natural Illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:2720-2734. [PMID: 31765304] [DOI: 10.1109/tpami.2019.2955459]
Abstract
This article presents a novel approach for depth map enhancement from an RGB-D video sequence. The basic idea is to exploit the photometric information in the color sequence to resolve the inherent ambiguity of the shape-from-shading problem. Instead of making any assumption about surface albedo or controlled object motion and lighting, we use the lighting variations introduced by casual object movement. We are effectively calculating photometric stereo from a moving object under natural illumination. One of the key technical challenges is to establish correspondences over the entire image set. We therefore develop a lighting-insensitive robust pixel matching technique that outperforms optical flow methods in the presence of lighting variations. An adaptive reference frame selection procedure is introduced to be more robust to imperfect Lambertian reflections. In addition, we present an expectation-maximization framework to recover the surface normal and albedo simultaneously, without any regularization term. We have validated our method on both synthetic and real datasets to show its superior performance on both surface detail recovery and intrinsic decomposition.
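The photometric stereo being "effectively calculated" here can be sketched in its classical calibrated Lambertian form (this is the textbook version, not the paper's expectation-maximization variant): given one pixel's intensities under K known directional lights, solve I = L (albedo * n) by least squares and split magnitude from direction.

```python
import numpy as np

rng = np.random.default_rng(2)

# K = 6 known unit light directions (randomly chosen for the sketch;
# we assume all lights face the surface, so no intensities are clipped).
L = rng.normal(size=(6, 3))
L /= np.linalg.norm(L, axis=1, keepdims=True)

true_n = np.array([0.0, 0.6, 0.8])     # ground-truth unit surface normal
true_albedo = 0.7
I = L @ (true_albedo * true_n)         # Lambertian intensities: I_k = rho * <l_k, n>

# Least-squares recovery of g = albedo * n, then split into the two factors.
g, *_ = np.linalg.lstsq(L, I, rcond=None)
albedo = np.linalg.norm(g)
normal = g / albedo
```

The point of the paper is that L is not given by a calibrated rig: the lighting variation comes from casual object motion under natural illumination, which is what makes correspondence and estimation hard.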
12. Xu J, Hou Y, Ren D, Liu L, Zhu F, Yu M, Wang H, Shao L. STAR: A Structure and Texture Aware Retinex Model. IEEE Transactions on Image Processing 2020; 29:5022-5037. [PMID: 32167892] [DOI: 10.1109/tip.2020.2974060]
Abstract
Retinex theory was developed mainly to decompose an image into illumination and reflectance components by analyzing local image derivatives. In this theory, larger derivatives are attributed to changes in reflectance, while smaller derivatives arise in the smooth illumination. In this paper, we utilize exponentiated local derivatives (with an exponent γ) of an observed image to generate its structure map and texture map. The structure map is produced by amplifying the local derivatives with γ > 1, while the texture map is generated by shrinking them with γ < 1. To this end, we design exponential filters for the local derivatives and demonstrate their capability for extracting accurate structure and texture maps, influenced by the choice of exponent γ. The extracted structure and texture maps are employed to regularize the illumination and reflectance components in Retinex decomposition. A novel Structure and Texture Aware Retinex (STAR) model is further proposed for illumination and reflectance decomposition of a single image. We solve the STAR model with an alternating optimization algorithm. Each sub-problem is transformed into a vectorized least squares regression with closed-form solutions. Comprehensive experiments on commonly tested datasets demonstrate that the proposed STAR model produces better quantitative and qualitative performance than previous competing methods on illumination and reflectance decomposition, low-light image enhancement, and color correction. The code is publicly available at https://github.com/csjunxu/STAR.
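The effect of the exponent γ can be seen on a toy 1-D signal (variable names and the specific γ values are my own): raising |∇I| to γ > 1 makes the large structural edge dominate, while γ < 1 compresses the gap between the edge and small texture responses.

```python
import numpy as np

# A 1-D "image" with one strong structural edge and small texture wiggles.
signal = np.array([0.2, 0.2, 0.21, 0.8, 0.8, 0.79, 0.8])
grad = np.abs(np.diff(signal))   # local derivative magnitudes

structure_map = grad ** 2.0      # gamma > 1: amplifies large derivatives
texture_map = grad ** 0.5        # gamma < 1: shrinks toward small ones

# The structural edge still dominates the structure map...
assert structure_map.argmax() == grad.argmax()
# ...while gamma < 1 compresses the edge-to-texture response ratio,
# letting texture survive in the texture map.
assert (texture_map.max() / texture_map[texture_map > 0].min()
        < grad.max() / grad[grad > 0].min())
```

In STAR these two maps then act as weights regularizing the illumination and reflectance sub-problems, each of which has a closed-form least-squares solution.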
13. Ibarra-Arenado MJ, Tjahjadi T, Pérez-Oria J. Shadow Detection in Still Road Images Using Chrominance Properties of Shadows and Spectral Power Distribution of the Illumination. Sensors 2020; 20:s20041012. [PMID: 32069938] [PMCID: PMC7070959] [DOI: 10.3390/s20041012]
Abstract
A well-known challenge in vision-based driver assistance systems is cast shadows on the road, which make fundamental tasks such as road and lane detection difficult. Inasmuch as shadow detection relies on shadow features, in this paper we propose a set of new chrominance properties of shadows based on the skylight and sunlight contributions to the road surface chromaticity. Six constraints on shadowed and non-shadowed regions are derived from these properties. The chrominance properties and the associated constraints are used as shadow features in an effective shadow detection method intended to be integrated into an onboard road detection system, where the identification of cast shadows on the road is a determinant stage. Onboard systems deal with still outdoor images; thus, the approach focuses on distinguishing shadow boundaries from material changes by considering two illumination sources: sky and sun. A non-shadowed road region is illuminated by both skylight and sunlight, whereas a shadowed one is illuminated by skylight only; thus, their chromaticity varies. The shadow edge detection strategy consists of the identification of image edges separating shadowed and non-shadowed road regions. The classification is achieved by verifying whether the pixel chrominance values of regions on both sides of the image edges satisfy the six constraints. Experiments on real traffic scenes demonstrated the effectiveness of our shadow detection system in detecting shadow edges on the road and material-change edges, outperforming previous shadow detection methods based on physical features, and showing the high potential of the new chrominance properties.
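The physical cue behind the constraints can be demonstrated numerically (the illuminant RGB values below are invented for illustration, not taken from the paper): a lit road patch sees skylight plus sunlight, a shadowed patch sees skylight only, so across a shadow edge on the same material the chromaticity shifts toward blue in a predictable direction.

```python
import numpy as np

sky = np.array([0.9, 1.0, 1.4])          # bluish skylight RGB power (assumed)
sun = np.array([1.3, 1.2, 1.0])          # yellowish direct sunlight RGB (assumed)
road_albedo = np.array([0.4, 0.4, 0.42]) # one road material (assumed)

lit = road_albedo * (sky + sun)          # non-shadowed: skylight + sunlight
shadowed = road_albedo * sky             # shadowed: skylight only

def chromaticity(rgb):
    """Intensity-normalized color: removes overall brightness."""
    return rgb / rgb.sum()

# Across a shadow edge the shadowed side is relatively bluer and less red;
# a material-change edge has no reason to obey this sign pattern, which is
# what lets the constraints separate the two edge types.
assert chromaticity(shadowed)[2] > chromaticity(lit)[2]
assert chromaticity(shadowed)[0] < chromaticity(lit)[0]
```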
Affiliation(s)
- Manuel José Ibarra-Arenado: Department of Electrical and Energy Engineering, University of Cantabria, Avda. Los Castros s/n, 39005 Santander, Spain (correspondence; Tel.: +34-942-201-360; Fax: +34-942-201-385)
- Tardi Tjahjadi: School of Engineering, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, UK
- Juan Pérez-Oria: Department of Electronic Technology and Automatic Systems, University of Cantabria, Avda. Los Castros s/n, 39005 Santander, Spain
14. Son M, Lee Y, Chang HS. Toward Specular Removal from Natural Images Based on Statistical Reflection Models. IEEE Transactions on Image Processing 2020; 29:4204-4218. [PMID: 32031936] [DOI: 10.1109/tip.2020.2967857]
Abstract
Removing specular reflections from images is critical for improving the performance of computer vision algorithms. Recently, state-of-the-art methods have demonstrated remarkably good performance at removing specular reflections from chromatic images. These methods are typically based on the chromatic pixels assumption; therefore, they are prone to failure in the achromatic regions. This paper presents a novel method that is applicable to natural images, because it is effective for both chromatic and achromatic regions. The proposed method is based on modeling the general properties of diffuse and specular reflections in a solid convex optimization framework. Considering the physical constraints, we determine the global optimal solution using the split Bregman method. Experimental results demonstrate the effectiveness of the proposed method, particularly for the achromatic regions, and its competence as a state-of-the-art method for removing specular reflections from the chromatic regions.
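The chromatic-pixels assumption the paper criticizes can be made concrete with the classic heuristic built on it (this sketch is that baseline, not the paper's convex method): under the dichromatic model I = diffuse + specular with a white highlight, subtracting each pixel's minimum channel removes the specular term for chromatic pixels but destroys achromatic ones.

```python
import numpy as np

# One chromatic pixel (has a zero channel) and one achromatic (gray) pixel.
diffuse = np.array([[0.6, 0.2, 0.0],
                    [0.5, 0.5, 0.5]])
specular = 0.3                      # white highlight added to both pixels
image = diffuse + specular          # dichromatic model: I = diffuse + specular

# Classic heuristic: subtract the per-pixel minimum channel.
specular_free = image - image.min(axis=1, keepdims=True)

# Works for the chromatic pixel: recovers its diffuse component exactly.
assert np.allclose(specular_free[0], diffuse[0])
# Fails for the achromatic pixel: the whole signal is removed, not just
# the highlight - precisely the failure mode the paper addresses.
assert np.allclose(specular_free[1], 0.0)
```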
15. Sheng B, Li P, Jin Y, Tan P, Lee TY. Intrinsic Image Decomposition with Step and Drift Shading Separation. IEEE Transactions on Visualization and Computer Graphics 2020; 26:1332-1346. [PMID: 30207961] [DOI: 10.1109/tvcg.2018.2869326]
Abstract
Decomposing an image into shading and reflectance layers remains challenging due to its severely under-constrained nature. We present an approach based on illumination decomposition that recovers the intrinsic images without additional information, e.g., depth or user interaction. Our approach is based on the rationale that the shading component contains step and drift channels simultaneously. We decompose the illumination into two channels: the step shading, corresponding to sharp shading changes due to cast shadows or abrupt shape changes; and the drift shading, accounting for smooth shading variations due to gradual illumination changes or slow shape changes. By replacing the conventional assumption that shading is smooth with this two-channel formulation, our model has advantages in handling real images, especially those with cast shadows or strong shape edges. We also apply a much stricter edge classifier along with a reinforcement process to enhance our method. We formulate the problem using a two-parameter energy function and split it into two energy functions corresponding to the reflectance and the step shading. Experiments on the MIT dataset, the IIW dataset and the MPI Sintel dataset have shown the success of our approach over state-of-the-art methods.
16. Sial HA, Baldrich R, Vanrell M. Deep intrinsic decomposition trained on surreal scenes yet with realistic light effects. Journal of the Optical Society of America A 2020; 37:1-15. [PMID: 32118875] [DOI: 10.1364/josaa.37.000001]
Abstract
Estimation of intrinsic images still remains a challenging task due to weaknesses of ground-truth datasets, which are either too small or not realistic. On the other hand, end-to-end deep learning architectures have started to achieve interesting results that we believe could be improved if important physical hints were not ignored. In this work, we present a twofold framework: (a) a flexible generation of images that overcomes some classical dataset problems, providing larger size jointly with coherent lighting appearance; and (b) a flexible architecture that ties physical properties through intrinsic losses. Our proposal is versatile, presents low computation time, and achieves state-of-the-art results.
17. Lerer A, Supèr H, Keil MS. Luminance gradients and non-gradients as a cue for distinguishing reflectance and illumination in achromatic images: A computational approach. Neural Netw 2018; 110:66-81. [PMID: 30496916] [DOI: 10.1016/j.neunet.2018.11.001]
Abstract
The brain analyses the visual world through the luminance patterns that reach the retina. Formally, luminance (as measured by the retina) is the product of illumination and reflectance. Whereas illumination is highly variable, reflectance is a physical property that characterizes each object surface. Due to memory constraints, it seems plausible that the visual system suppresses illumination patterns before object recognition takes place. Since many combinations of reflectance and illumination can give rise to identical luminance values, finding the correct reflectance value of a surface is an ill-posed problem, and it is still an open question how it is solved by the brain. Here we propose a computational approach that first learns filter kernels ("receptive fields") for slow and fast variations in luminance, respectively, from achromatic real-world images. Distinguishing between luminance gradients (slow variations) and non-gradients (fast variations) could serve to constrain the mentioned ill-posed problem. The second stage of our approach successfully segregates luminance gradients and non-gradients from real-world images. Our approach furthermore predicts that visual illusions that contain luminance gradients (such as Adelson's checker-shadow display or grating induction) may occur as a consequence of this segregation process.
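The gradient/non-gradient segregation can be illustrated with a toy stand-in (Gaussian smoothing in place of the learned receptive fields, and my own 1-D luminance profile): slow variations are captured by a low-pass estimate and the fast residual tracks the reflectance-like structure.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian filter kernel."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

n = np.arange(256)
illumination = 0.5 + 0.3 * np.sin(2 * np.pi * n / 256)   # slow luminance gradient
reflectance = np.where((n // 32) % 2 == 0, 0.4, 0.9)     # fast non-gradient steps
luminance = illumination * reflectance                   # what the retina measures

# Slow component: heavy low-pass filtering of the luminance profile.
slow = np.convolve(luminance, gaussian_kernel(20.0, 60), mode="same")
# Fast (non-gradient) residual after dividing out the slow component.
fast = luminance / np.maximum(slow, 1e-6)

# Away from the borders, the residual tracks reflectance far better than
# the raw luminance does: the slow gradient has been segregated out.
corr = lambda a, b: np.corrcoef(a, b)[0, 1]
assert corr(fast[64:192], reflectance[64:192]) > corr(luminance[64:192], reflectance[64:192])
```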
Affiliation(s)
- Alejandro Lerer: Departament de Cognició, Desenvolupament i Psicologia de l’Educació, Faculty of Psychology, University of Barcelona, Barcelona, Spain
- Hans Supèr: Departament de Cognició, Desenvolupament i Psicologia de l’Educació, Faculty of Psychology, University of Barcelona, Barcelona, Spain; Institut de Neurociències, Universitat de Barcelona, Barcelona, Spain; Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, Spain; Catalan Institute for Advanced Studies (ICREA), Barcelona, Spain
- Matthias S Keil: Departament de Cognició, Desenvolupament i Psicologia de l’Educació, Faculty of Psychology, University of Barcelona, Barcelona, Spain; Institut de Neurociències, Universitat de Barcelona, Barcelona, Spain
18. Hyperspectral Pansharpening Based on Intrinsic Image Decomposition and Weighted Least Squares Filter. Remote Sensing 2018. [DOI: 10.3390/rs10030445]
19. Tappen MF. Learning-Based Shadow Recognition and Removal From Monochromatic Natural Images. IEEE Transactions on Image Processing 2017; 26:5811-5824. [PMID: 28796618] [DOI: 10.1109/tip.2017.2737321]
Abstract
This paper addresses the problem of recognizing and removing shadows from monochromatic natural images from a learning-based perspective. Without chromatic information, shadow recognition and removal are extremely challenging, mainly due to the absence of invariant color cues. Natural scenes make the problem even harder due to complex illumination conditions and ambiguity from many near-black objects. In this paper, a learning-based shadow recognition and removal scheme is proposed to tackle the aforementioned challenges. First, we propose to use both shadow-variant and shadow-invariant cues from illumination, texture, and odd-order derivative characteristics to recognize shadows. Such features are used to train a classifier via boosting a decision tree, which is then integrated into a conditional random field that can enforce local consistency over pixel labels. Second, a Gaussian model is introduced to remove the recognized shadows from monochromatic natural scenes. The proposed scheme is evaluated using both qualitative and quantitative results based on a novel database of hand-labeled shadows, with comparisons to existing state-of-the-art schemes. We show that the shadowed areas of a monochromatic image can be accurately identified using the proposed scheme, and high-quality shadow-free images can be precisely recovered after shadow removal.
20. Meka A, Fox G, Zollhofer M, Richardt C, Theobalt C. Live User-Guided Intrinsic Video for Static Scenes. IEEE Transactions on Visualization and Computer Graphics 2017; 23:2447-2454. [PMID: 28809688] [DOI: 10.1109/tvcg.2017.2734425]
Abstract
We present a novel real-time approach for user-guided intrinsic decomposition of static scenes captured by an RGB-D sensor. In the first step, we acquire a three-dimensional representation of the scene using a dense volumetric reconstruction framework. The obtained reconstruction serves as a proxy to densely fuse reflectance estimates and to store user-provided constraints in three-dimensional space. User constraints, in the form of constant shading and reflectance strokes, can be placed directly on the real-world geometry using an intuitive touch-based interaction metaphor, or using interactive mouse strokes. Fusing the decomposition results and constraints in three-dimensional space allows for robust propagation of this information to novel views by re-projection. We leverage this information to improve on the decomposition quality of existing intrinsic video decomposition techniques by further constraining the ill-posed decomposition problem. In addition to improved decomposition quality, we show a variety of live augmented reality applications such as recoloring of objects, relighting of scenes and editing of material appearance.
|
21
|
Zhang L, Yan Q, Liu Z, Zou H, Xiao C. Illumination Decomposition for Photograph With Multiple Light Sources. IEEE TRANSACTIONS ON IMAGE PROCESSING 2017; 26:4114-4127. [PMID: 28600244 DOI: 10.1109/tip.2017.2712283]
Abstract
Illumination decomposition for a single photograph is an important and challenging problem in image editing. In this paper, we present a novel coarse-to-fine strategy for illumination decomposition of photographs with multiple light sources. We first reconstruct the lighting environment of the image using the estimated geometry of the scene. Given the light positions, we detect the shadow regions as well as the highlights in the projected image for each light. Then, using the illumination cues from shadows, we estimate a coarse decomposition of the image into components emitted by each light source. Finally, we present a light-aware illumination optimization model that efficiently produces refined illumination decomposition results and recovers texture detail under the shadows. We validate our approach on a number of examples; our method effectively decomposes the input image into multiple components corresponding to the different light sources.
|
22
|
Chen X, Zhu W, Zhao Y, Yu Y, Zhou Y, Yue T, Du S, Cao X. Intrinsic decomposition from a single spectral image. APPLIED OPTICS 2017; 56:5676-5684. [PMID: 29047710 DOI: 10.1364/ao.56.005676]
Abstract
In this paper, we present a spectral intrinsic image decomposition (SIID) model, which resolves a natural scene into purely independent intrinsic components: illumination, shading, and reflectance. By introducing spectral information, our work can solve many challenging cases, such as scenes with metameric effects, which are hard to tackle for trichromatic intrinsic image decomposition (IID), and thus offers potential benefits to many higher-level vision tasks, e.g., material classification and recognition, shape-from-shading, and spectral image relighting. An effective and efficient algorithm is presented to decompose a spectral image into its independent intrinsic components. To facilitate future SIID research, we present a public dataset with ground-truth illumination, shading, reflectance, and specularity, together with a meaningful error metric, so that quantitative comparison becomes feasible. Experiments on this dataset and other images demonstrate the accuracy and robustness of the proposed method on diverse scenes, and reveal that more spectral channels indeed facilitate vision tasks such as segmentation and recognition.
|
23
|
Nadian-Ghomsheh A, Hassanian Y, Navi K. Intrinsic Image Decomposition via Structure-Preserving Image Smoothing and Material Recognition. PLoS One 2016; 11:e0166772. [PMID: 27992431 PMCID: PMC5161468 DOI: 10.1371/journal.pone.0166772]
Abstract
Decoupling shading and reflectance from complex scene images is a long-standing problem in computer vision. We introduce a framework for decomposing an image into the product of an illumination component and a reflectance component. Due to the ill-posed nature of the problem, prior information on shading and reflectance is mandatory. The proposed method adopts the premise that pixels in a region with similar chromaticity values should have the same reflectance. This assumption is used to minimize the l2 norm of the local per-pixel reflectance gradients in order to extract the shading and reflectance components. To obtain smooth chromatic regions, texture is handled in a new way: it is removed in the first step of the algorithm, and the smoothed image is then processed for intrinsic decomposition. In the final step, texture details are added back to the intrinsic components according to the material of each pixel. In addition, user assistance can be used to further refine the results. Qualitative and quantitative evaluation on the MIT intrinsic dataset indicates that the quality of intrinsic image decomposition is improved in comparison with previous methods.
Affiliation(s)
- Yassin Hassanian
- Computer Science and Engineering Department, Shahid Beheshti University, Tehran, Iran
- Keyvan Navi
- Computer Science and Engineering Department, Shahid Beheshti University, Tehran, Iran
|
24
|
Joint Model and Observation Cues for Single-Image Shadow Detection. REMOTE SENSING 2016. [DOI: 10.3390/rs8060484]
|
25
|
Barron JT, Malik J. Intrinsic Scene Properties from a Single RGB-D Image. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2016; 38:690-703. [PMID: 26959674 DOI: 10.1109/tpami.2015.2439286]
Abstract
In this paper, we present a technique for recovering a model of shape, illumination, reflectance, and shading from a single image taken from an RGB-D sensor. To do this, we extend the SIRFS ("shape, illumination and reflectance from shading") model, which recovers intrinsic scene properties from a single image. Though SIRFS works well on neatly segmented images of objects, it performs poorly on images of natural scenes, which often contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we model a scene using a mixture of shapes and a mixture of illuminations, where those mixture components are embedded in a "soft" segmentation-like representation of the input image. We use the noisy depth maps provided by RGB-D sensors (such as the Microsoft Kinect) to guide and improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications such as relighting and retargeting, or for broader applications (recognition, segmentation) involving RGB-D images.
|
26
|
Vo M, Narasimhan SG, Sheikh Y. Texture Illumination Separation for Single-Shot Structured Light Reconstruction. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2016; 38:390-404. [PMID: 26761742 DOI: 10.1109/tpami.2015.2443775]
Abstract
Active illumination based methods have a trade-off between acquisition time and resolution of the estimated 3D shapes. Multi-shot approaches can generate dense reconstructions but require stationary scenes. Single-shot methods are applicable to dynamic objects but can only estimate sparse reconstructions and are sensitive to surface texture. We present a single-shot approach that produces dense shape reconstructions of highly textured objects illuminated by one or more projectors. The key to our approach is an image decomposition scheme that can recover the illumination image of different projectors and the texture images of the scene from their mixed appearances. We focus on three cases of mixed appearances: illumination from one projector onto a textured surface, illumination from multiple projectors onto a textureless surface, and their combined effect. Our method can accurately compute per-pixel warps from the illumination patterns and the texture template to the observed image. The texture template is obtained by interleaving the projection sequence with an all-white pattern. The estimated warps are reliable even with infrequent interleaved projection and strong object deformation. Thus, we obtain detailed shape reconstruction and dense motion tracking of the textured surfaces. The proposed method, implemented on a system with one camera and two projectors, is validated on synthetic and real data containing subtle non-rigid surface deformations.
|
27
|
Barron JT, Malik J. Shape, Illumination, and Reflectance from Shading. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2015; 37:1670-1687. [PMID: 26353003 DOI: 10.1109/tpami.2014.2377712]
Abstract
A fundamental problem in computer vision is that of inferring the intrinsic, 3D structure of the world from flat, 2D images of that world. Traditional methods for recovering scene properties such as shape, reflectance, or illumination rely on multiple observations of the same scene to overconstrain the problem. Recovering these same properties from a single image seems almost impossible in comparison: there are an infinite number of shapes, paints, and lights that exactly reproduce a single image. However, certain explanations are more likely than others: surfaces tend to be smooth, paint tends to be uniform, and illumination tends to be natural. We therefore pose this problem as one of statistical inference, and define an optimization problem that searches for the most likely explanation of a single image. Our technique can be viewed as a superset of several classic computer vision problems (shape-from-shading, intrinsic images, color constancy, illumination estimation, etc.) and outperforms all previous solutions to those constituent problems.
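The statistical-inference problem this abstract describes can be stated compactly, following the paper's formulation (notation lightly paraphrased here): given a log-intensity image $I$, seek the most likely shape $Z$, reflectance $R$, and illumination $L$ that explain it,

```latex
\underset{Z,\,R,\,L}{\mathrm{maximize}} \quad P(Z)\,P(R)\,P(L)
\qquad \text{subject to} \qquad I = R + S(Z, L),
```

where $S(Z,L)$ is the log-shading image produced by rendering shape $Z$ under illumination $L$, and the priors $P(\cdot)$ encode the regularities named in the abstract: surfaces tend to be smooth, paint tends to be uniform, and illumination tends to be natural.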
|
28
|
Foster DH, Amano K, Nascimento SMC. Time-lapse ratios of cone excitations in natural scenes. Vision Res 2015; 120:45-60. [PMID: 25847405 DOI: 10.1016/j.visres.2015.03.012]
Abstract
The illumination in natural environments varies through the day. Stable inferences about surface color might be supported by spatial ratios of cone excitations from the reflected light, but their invariance has been quantified only for global changes in illuminant spectrum. The aim here was to test their invariance under natural changes in both illumination spectrum and geometry, especially in the distribution of shadows. Time-lapse hyperspectral radiance images were acquired from five outdoor vegetated and nonvegetated scenes. From each scene, 10,000 pairs of points were sampled randomly and ratios measured across time. Mean relative deviations in ratios were generally large, but when sampling was limited to short distances or moderate time intervals, they fell below the level for detecting violations in ratio invariance. When illumination changes with uneven geometry were excluded, they fell further, to levels obtained with global changes in illuminant spectrum alone. Within sampling constraints, ratios of cone excitations, and also of opponent-color combinations, provide an approximately invariant signal for stable surface-color inferences, despite spectral and geometric variations in scene illumination.
Affiliation(s)
- David H Foster
- School of Electrical and Electronic Engineering, University of Manchester, Manchester M13 9PL, UK.
- Kinjiro Amano
- School of Electrical and Electronic Engineering, University of Manchester, Manchester M13 9PL, UK.
- Sérgio M C Nascimento
- Centre of Physics, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal.
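The ratio invariance measured in this study can be illustrated with a toy numerical sketch (all spectra below are invented Gaussians, not the study's hyperspectral data): a cone excitation is the integral of illuminant × reflectance × sensitivity over wavelength, and the spatial ratio of excitations at two surface points barely moves under a global change of illuminant spectrum.

```python
import numpy as np

wl = np.arange(400.0, 701.0, 10.0)          # sample wavelengths (nm)
step = wl[1] - wl[0]

def gaussian(center, width):
    return np.exp(-((wl - center) / width) ** 2)

cone = gaussian(550, 30)                    # toy medium-wave cone sensitivity
refl_a = 0.2 + 0.6 * gaussian(540, 90)      # made-up reflectances of two surface points
refl_b = 0.5 + 0.3 * gaussian(600, 120)

def excitation(illum, refl):
    # Cone excitation: integral of illuminant x reflectance x sensitivity.
    return np.sum(illum * refl * cone) * step

def ratio(illum):
    # Spatial ratio of cone excitations between the two surface points.
    return excitation(illum, refl_a) / excitation(illum, refl_b)

illum_flat = np.ones_like(wl)               # flat midday illuminant
illum_tilt = 0.3 + 0.7 * (wl - 400) / 300   # reddish global change in illuminant spectrum

deviation = abs(ratio(illum_flat) - ratio(illum_tilt)) / ratio(illum_flat)
print(deviation)                            # small: the ratio is nearly invariant
```

The invariance is exact for a pure intensity change and only approximate for a spectral change; it also presupposes that both points see the same illuminant, which is why the study's cases with uneven illumination geometry (e.g., a shadow edge falling between the two points) showed much larger deviations.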
|
29
|
|
30
|
Qu L, Tian J, Han Z, Tang Y. Pixel-wise orthogonal decomposition for color illumination invariant and shadow-free image. OPTICS EXPRESS 2015; 23:2220-2239. [PMID: 25836092 DOI: 10.1364/oe.23.002220]
Abstract
In this paper, we propose a novel, effective and fast method to obtain a color-illumination-invariant and shadow-free image from a single outdoor image. Unlike state-of-the-art shadow-free image methods that require either shadow detection or statistical learning, we set up a linear equation set for each pixel value vector based on physically-based shadow invariants, deduce a pixel-wise orthogonal decomposition for its solutions, and thereby obtain an illumination-invariant vector for each pixel value vector in the image. The illumination-invariant vector is the unique particular solution of the linear equation set that is orthogonal to its free solutions. Using this illumination-invariant vector and the Lab color space, we propose an algorithm to generate a shadow-free image that preserves the texture and color information of the original image well. A series of experiments on a diverse set of outdoor images, and comparisons with state-of-the-art methods, validate our approach.
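The algebra behind this construction can be sketched in a few lines: an underdetermined linear system has a unique particular solution lying in the row space of the constraint matrix, and that solution is orthogonal to every free (null-space) solution. The matrix below is an arbitrary stand-in for the per-pixel constraint system, not the paper's actual shadow-invariant equations.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))   # stand-in constraints: 2 equations, 3 unknowns
b = rng.standard_normal(2)

# Minimum-norm particular solution: it lies in the row space of A,
# so it is orthogonal to all free (homogeneous) solutions.
x_part = np.linalg.pinv(A) @ b

# Null-space basis from the SVD (rows of Vt beyond the rank).
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[2:]

print(A @ x_part - b)             # ~0: x_part solves the system
print(null_basis @ x_part)        # ~0: orthogonal to the free solutions
```

Any other solution differs from `x_part` only by a null-space component, which is what makes the orthogonal part a well-defined, pixel-wise invariant.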
|
31
|
Vladusich T, McDonnell MD. A unified account of perceptual layering and surface appearance in terms of gamut relativity. PLoS One 2014; 9:e113159. [PMID: 25402466 PMCID: PMC4234682 DOI: 10.1371/journal.pone.0113159]
Abstract
When we look at the world--or a graphical depiction of the world--we perceive surface materials (e.g. a ceramic black and white checkerboard) independently of variations in illumination (e.g. shading or shadow) and atmospheric media (e.g. clouds or smoke). Such percepts are partly based on the way physical surfaces and media reflect and transmit light and partly on the way the human visual system processes the complex patterns of light reaching the eye. One way to understand how these percepts arise is to assume that the visual system parses patterns of light into layered perceptual representations of surfaces, illumination and atmospheric media, one seen through another. Despite a great deal of previous experimental and modelling work on layered representation, however, a unified computational model of key perceptual demonstrations is still lacking. Here we present the first general computational model of perceptual layering and surface appearance--based on a broader theoretical framework called gamut relativity--that is consistent with these demonstrations. The model (a) qualitatively explains striking effects of perceptual transparency, figure-ground separation and lightness, (b) quantitatively accounts for the role of stimulus- and task-driven constraints on perceptual matching performance, and (c) unifies two prominent theoretical frameworks for understanding surface appearance. The model thereby provides novel insights into the remarkable capacity of the human visual system to represent and identify surface materials, illumination and atmospheric media, which can be exploited in computer graphics applications.
Affiliation(s)
- Tony Vladusich
- Institute for Telecommunications Research, University of South Australia, Mawson Lakes, 5095, Australia
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, United States of America
- Mark D. McDonnell
- Institute for Telecommunications Research, University of South Australia, Mawson Lakes, 5095, Australia
|
32
|
|
33
|
Sharan L, Rosenholtz R, Adelson EH. Accuracy and speed of material categorization in real-world images. J Vis 2014; 14:14.9.12. [PMID: 25122216 DOI: 10.1167/14.9.12]
Abstract
It is easy to visually distinguish a ceramic knife from one made of steel, a leather jacket from one made of denim, and a plush toy from one made of plastic. Most studies of material appearance have focused on the estimation of specific material properties such as albedo or surface gloss, and as a consequence, almost nothing is known about how we recognize material categories like leather or plastic. We have studied judgments of high-level material categories with a diverse set of real-world photographs, and we have shown (Sharan, 2009) that observers can categorize materials reliably and quickly. Performance on our tasks cannot be explained by simple differences in color, surface shape, or texture. Nor can the results be explained by observers merely performing shape-based object recognition. Rather, we argue that fast and accurate material categorization is a distinct, basic ability of the visual system.
Affiliation(s)
- Lavanya Sharan
- Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ruth Rosenholtz
- Department of Brain & Cognitive Sciences, CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA
- Edward H Adelson
- Department of Brain & Cognitive Sciences, CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA
|
34
|
Specularity Removal for Single Image Based on Inpainting and Blending with Parameter Estimation by Neural Networks over Multiple Feature Spaces. Applied Mechanics and Materials 2014. [DOI: 10.4028/www.scientific.net/amm.555.773]
Abstract
Specularity removal is useful for image-related applications that need consistent object surface appearance. For a single image it is a more challenging problem, due to the presence of specular regions of different shapes, sizes, and colors, some of which may have parts with totally missing data. The problem becomes more difficult as specular regions with only partial information grow larger, because their exact boundaries are difficult to mark; region-filling methods can then produce poor results, since appropriate boundary selection is important for these methods. In this work, we address this problem and propose a scheme that handles specular regions by segmenting both types of specular sub-regions. Our multistage segmentation algorithm uses luminance as well as principal components to identify specular regions. For specularity removal, we propose a three-step scheme comprising illumination balancing, inpainting, and blending. Finally, a feed-forward neural network is proposed to estimate the tuning parameters, which not only automates the whole process but also simplifies the difficult task of choosing parameters such as the size of specular regions or the preprocessing method. The results demonstrate the effectiveness of the proposed method for a variety of images with natural specular reflections.
|
35
|
Li C, Gore JC, Davatzikos C. Multiplicative intrinsic component optimization (MICO) for MRI bias field estimation and tissue segmentation. Magn Reson Imaging 2014; 32:913-23. [PMID: 24928302 DOI: 10.1016/j.mri.2014.03.010]
Abstract
This paper proposes a new energy minimization method called multiplicative intrinsic component optimization (MICO) for joint bias field estimation and segmentation of magnetic resonance (MR) images. The proposed method takes full advantage of the decomposition of MR images into two multiplicative components, namely, the true image that characterizes a physical property of the tissues and the bias field that accounts for the intensity inhomogeneity, and of their respective spatial properties. Bias field estimation and tissue segmentation are achieved simultaneously by an energy minimization process aimed at optimizing the estimates of the two multiplicative components of an MR image. The bias field is iteratively optimized using efficient matrix computations, which are verified to be numerically stable by matrix analysis. More importantly, the energy in our formulation is convex in each of its variables, which leads to the robustness of the proposed energy minimization algorithm. The MICO formulation can be naturally extended to 3D/4D tissue segmentation with spatial/spatiotemporal regularization. Quantitative evaluations and comparisons with popular software packages have demonstrated the superior performance of MICO in terms of robustness and accuracy.
Affiliation(s)
- Chunming Li
- Center of Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia 19104, USA.
- John C Gore
- Vanderbilt University Institute of Imaging Science, Vanderbilt University, Nashville, TN 37232, USA
- Christos Davatzikos
- Center of Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia 19104, USA
|
36
|
|
37
|
Shen L, Yeo C, Hua BS. Intrinsic image decomposition using a sparse representation of reflectance. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2013; 35:2904-2915. [PMID: 24136429 DOI: 10.1109/tpami.2013.136]
Abstract
Intrinsic image decomposition is an important problem that targets the recovery of shading and reflectance components from a single image. While this is an ill-posed problem on its own, we propose a novel approach for intrinsic image decomposition using reflectance sparsity priors that we have developed. Our sparse representation of reflectance is based on a simple observation: Neighboring pixels with similar chromaticities usually have the same reflectance. We formalize and apply this sparsity constraint on local reflectance to construct a data-driven second-generation wavelet representation. We show that the reflectance component of natural images is sparse in this representation. We further propose and formulate a global sparse constraint on reflectance colors using the assumption that each natural image uses a small set of material colors. Using this sparse reflectance representation and the global constraint on a sparse set of reflectance colors, we formulate a constrained l₁-norm minimization problem for intrinsic image decomposition that can be solved efficiently. Our algorithm can successfully extract intrinsic images from a single image without using color models or any user interaction. Experimental results on a variety of images demonstrate the effectiveness of the proposed technique.
Affiliation(s)
- Li Shen
- Institute for Infocomm Research, Singapore
|
38
|
Chen X, Wu H, Jin X, Zhao Q. Face illumination manipulation using a single reference image by adaptive layer decomposition. IEEE TRANSACTIONS ON IMAGE PROCESSING 2013; 22:4249-4259. [PMID: 23807447 DOI: 10.1109/tip.2013.2271548]
Abstract
This paper proposes a novel image-based framework to manipulate the illumination of a human face through adaptive layer decomposition. Our framework needs only a single reference image, without any knowledge of the 3D geometry or material of the input face. To transfer the illumination effects of a reference face image to a face under normal lighting, we first decompose the lightness layers of the reference and input images into large-scale and detail layers using a weighted least squares (WLS) filter whose smoothing parameters adapt to the gradient values of the face images. The large-scale layer of the reference image is then filtered under the guidance of the input image by a guided filter whose smoothing parameters adapt to the face structures. The relit result is obtained by replacing the large-scale layer of the input image with that of the reference image. To normalize the illumination effects of a non-normally lit face (i.e., face delighting), we introduce a similar-reflectance prior into the WLS layer decomposition stage, which makes the normalized result less affected by the high-contrast light and shadow effects of the input face. Through these two procedures, we can change the illumination effects of a non-normally lit face by first normalizing its illumination and then transferring the illumination of another reference face to it. We obtain convincing relit results for both face relighting and delighting on numerous input and reference face images with various illumination effects and genders. Comparisons with previous work show that our framework is less affected by geometry differences and better preserves the identity structure and skin color of the input face.
|
39
|
Allred SR, Brainard DH. A Bayesian model of lightness perception that incorporates spatial variation in the illumination. J Vis 2013; 13:18. [PMID: 23814073 PMCID: PMC3697904 DOI: 10.1167/13.7.18]
Abstract
The lightness of a test stimulus depends in a complex manner on the context in which it is viewed. To predict lightness, it is necessary to leverage measurements of a feasible number of contextual configurations into predictions for a wider range of configurations. Here we pursue this goal, using the idea that lightness results from the visual system's attempt to provide stable information about object surface reflectance. We develop a Bayesian algorithm that estimates both illumination and reflectance from image luminance, and link perceived lightness to the algorithm's estimates of surface reflectance. The algorithm resolves ambiguity in the image through the application of priors that specify what illumination and surface reflectances are likely to occur in viewed scenes. The prior distributions were chosen to allow spatial variation in both illumination and surface reflectance. To evaluate our model, we compared its predictions to a data set of judgments of perceived lightness of test patches embedded in achromatic checkerboards (Allred, Radonjić, Gilchrist, & Brainard, 2012). The checkerboard stimuli incorporated the large variation in luminance that is a pervasive feature of natural scenes. In addition, the luminance profile of the checks both near to and remote from the central test patches was systematically manipulated. The manipulations provided a simplified version of spatial variation in illumination. The model can account for effects of overall changes in image luminance and the dependence of such changes on spatial location as well as some but not all of the more detailed features of the data.
Affiliation(s)
- Sarah R. Allred
- Department of Psychology, Rutgers, The State University of New Jersey, Camden, NJ, USA
- David H. Brainard
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
|
40
|
Shen J, Yang X, Li X, Jia Y. Intrinsic Image Decomposition Using Optimization and User Scribbles. IEEE TRANSACTIONS ON CYBERNETICS 2013; 43:425-436. [PMID: 22907970 DOI: 10.1109/tsmcb.2012.2208744]
Abstract
In this paper, we present a novel high-quality intrinsic image recovery approach using optimization and user scribbles. Our approach is based on an assumption about the color characteristics of local windows in natural images: neighboring pixels in a local window with similar intensity values should have similar reflectance values. The intrinsic image decomposition is thus formulated as minimizing an energy function that adds a weighting constraint on local image properties. To further improve the decomposition results, we specify local constraint cues by integrating user strokes into our energy formulation, including constant-reflectance, constant-illumination, and fixed-illumination brushes. Our experimental results demonstrate that the proposed approach recovers the intrinsic reflectance and illumination components better than previous approaches.
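A minimal sketch of this kind of weighted energy minimization (a generic 1-D quadratic model in the log domain, not the authors' exact formulation and without their scribble terms): with log I = r + s, penalize reflectance differences heavily between neighbors of similar intensity, penalize shading differences everywhere, and solve the resulting normal equations for the shading s.

```python
import numpy as np

# Toy 1-D scanline in the log domain: log I = reflectance r + shading s.
n = 64
shading_true = np.linspace(0.0, 1.0, n)                    # smooth shading ramp
reflect_true = np.where(np.arange(n) < n // 2, 0.0, 0.8)   # one reflectance edge
log_I = shading_true + reflect_true

D = np.eye(n)[1:] - np.eye(n)[:-1]    # forward differences: (D @ x)[i] = x[i+1] - x[i]
dI = D @ log_I

# Premise: neighbors with similar intensity have similar reflectance,
# so reflectance gradients get a large weight where |dI| is small.
w = np.exp(-(dI ** 2) / 0.01)
lam, mu = 0.1, 1e-6                   # shading smoothness; tiny anchor fixing s[0] ~ 0

# Minimize  sum_i w_i (D(log_I - s))_i^2 + lam * sum_i (D s)_i^2 + mu * s_0^2.
W = np.diag(w)
e0 = np.zeros(n); e0[0] = 1.0
lhs = D.T @ W @ D + lam * D.T @ D + mu * np.outer(e0, e0)
s = np.linalg.solve(lhs, D.T @ W @ D @ log_I)
r = log_I - s                         # reflectance is the residual

def objective(x):
    return (w * (D @ (log_I - x)) ** 2).sum() + lam * ((D @ x) ** 2).sum() + mu * x[0] ** 2

# The reflectance edge survives in r; the shading stays smooth across it.
print((D @ s)[n // 2 - 1], (D @ r)[n // 2 - 1])
```

The scribble brushes the abstract mentions would enter this formulation as extra quadratic terms pinning r or s inside the marked regions.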
|
41
|
Laffont PY, Bousseau A, Drettakis G. Rich Intrinsic Image Decomposition of Outdoor Scenes from Multiple Views. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2013; 19:210-224. [PMID: 22508899 DOI: 10.1109/tvcg.2012.112]
Abstract
Intrinsic images aim at separating an image into its reflectance and illumination components to facilitate further analysis or manipulation. This separation is severely ill posed and the most successful methods rely on user indications or precise geometry to resolve the ambiguities inherent to this problem. In this paper, we propose a method to estimate intrinsic images from multiple views of an outdoor scene without the need for precise geometry and with a few manual steps to calibrate the input. We use multiview stereo to automatically reconstruct a 3D point cloud of the scene. Although this point cloud is sparse and incomplete, we show that it provides the necessary information to compute plausible sky and indirect illumination at each 3D point. We then introduce an optimization method to estimate sun visibility over the point cloud. This algorithm compensates for the lack of accurate geometry and allows the extraction of precise shadows in the final image. We finally propagate the information computed over the sparse point cloud to every pixel in the photograph using image-guided propagation. Our propagation not only separates reflectance from illumination, but also decomposes the illumination into a sun, sky, and indirect layer. This rich decomposition allows novel image manipulations as demonstrated by our results.
|
42
|
Liu X, Jiang L, Wong TT, Fu CW. Statistical Invariance for Texture Synthesis. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2012; 18:1836-1848. [PMID: 22392711 DOI: 10.1109/tvcg.2012.75]
Abstract
Estimating illumination and deformation fields on textures is essential for both analysis and application purposes. Traditional methods for such estimation usually require complicated and sometimes labor-intensive processing. In this paper, we propose a new perspective on this problem and suggest a novel statistical approach that is much simpler and more efficient. Our experiments show that many textures in daily life are statistically invariant in terms of colors and gradients. Variations in these statistics can be assumed to be caused by illumination and deformation. This implies that we can inversely estimate the spatially varying illumination and deformation from the variation of the texture statistics. This enables us to decompose a texture photo into an illumination field, a deformation field, and an implicit texture that is illumination- and deformation-free, within a short period of time and with minimal user input. By processing and recombining these components, a variety of synthesis effects, such as exemplar preparation, texture replacement, surface relighting, and geometry modification, can be achieved. Finally, convincing results are shown to demonstrate the effectiveness of the proposed method.
Collapse
Affiliation(s)
- Xiaopei Liu
- School of Computer Engineering, Nanyang Technological University, Block N4-B1b-13, North Spine, 50 Nanyang Avenue, Singapore 639798.
Collapse
|
43
|
|
44
|
Yang Q, Tan KH, Ahuja N. Shadow removal using bilateral filtering. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2012; 21:4361-4368. [PMID: 22829402 DOI: 10.1109/tip.2012.2208976] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
In this paper, we propose a simple but effective shadow removal method using a single input image. We first derive a 2-D intrinsic image from a single RGB camera image based solely on colors, particularly chromaticity. We next present a method to recover a 3-D intrinsic image based on bilateral filtering and the 2-D intrinsic image. In the derived 3-D intrinsic image, the luminance contrast caused by variations in geometry and illumination within regions of similar surface reflectance is effectively reduced, while the contrast between regions of different surface reflectance is preserved. However, the intrinsic image contains incorrect luminance values. To obtain the correct luminance, we decompose both the input RGB image and the intrinsic image into a base layer and a detail layer. We obtain a shadow-free image by combining the base layer of the input RGB image with the detail layer of the intrinsic image, so that the details of the intrinsic image are transferred to the input RGB image, from which the correct luminance values can be obtained. Unlike previous methods, the presented technique is fully automatic and does not require shadow detection.
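The base/detail recombination step can be sketched as follows. This is a minimal sketch, not the paper's implementation: a Gaussian filter stands in for the edge-preserving bilateral filter to keep the example dependency-light, and all names are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def split_layers(img, sigma=3.0):
    # Base = large-scale structure; detail = residual. The paper uses an
    # edge-preserving bilateral filter here; a plain Gaussian stands in
    # (note it will blur across strong edges, unlike a bilateral filter).
    base = gaussian_filter(img, sigma)
    return base, img - base

def shadow_free(input_luminance, intrinsic, sigma=3.0):
    # Combine the base layer of the input image with the detail layer of the
    # intrinsic image: correct luminance from the former, shadow-free detail
    # from the latter.
    base_in, _ = split_layers(input_luminance, sigma)
    _, detail_intr = split_layers(intrinsic, sigma)
    return base_in + detail_intr
```

By construction base + detail reproduces each image exactly, so the recombination only swaps which image contributes each frequency band.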
Collapse
Affiliation(s)
- Qingxiong Yang
- Department of Computer Science, City University of Hong Kong, Hong Kong.
Collapse
|
45
|
Isaza C, Salas J, Raducanu B. Evaluation of intrinsic image algorithms to detect the shadows cast by static objects outdoors. SENSORS 2012. [PMID: 23201998 DOI: 10.3390/s121013333] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In some automatic scene analysis applications, the presence of shadows becomes a nuisance that must be dealt with. As a consequence, a preliminary stage in many computer vision algorithms is to attenuate their effect. In this paper, we focus our attention on the detection of shadows cast by static objects outdoors, as the scene is viewed for extended periods of time (days, weeks) from a fixed camera, considering daylight intervals where the main source of light is the sun. In this context, we report two contributions. First, we introduce the use of synthetic images for which ground truth can be generated automatically, avoiding the tedious effort of manual annotation. Second, we report a novel application of the intrinsic image concept to the automatic detection of shadows cast by static objects outdoors. We make both a quantitative and a qualitative evaluation of several algorithms based on this image representation. For the quantitative evaluation, we used the synthetic data set, while for the qualitative evaluation we used both data sets. Our experimental results show that the evaluated methods can partially solve the problem of shadow detection.
Collapse
Affiliation(s)
- Cesar Isaza
- CICATA Qro, Instituto Politécnico Nacional, Cerro Blanco 141, Col. Colinas del Cimatario, Santiago de Queretaro CP 76090, Mexico.
Collapse
|
46
|
Zhao Q, Tan P, Dai Q, Shen L, Wu E, Lin S. A Closed-Form Solution to Retinex with Nonlocal Texture Constraints. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2012; 34:1437-1444. [PMID: 22450820 DOI: 10.1109/tpami.2012.77] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
We propose a method for intrinsic image decomposition based on retinex theory and texture analysis. While most previous methods approach this problem by analyzing local gradient properties, our technique additionally identifies distant pixels with the same reflectance through texture analysis and uses these nonlocal reflectance constraints to significantly reduce ambiguity in the decomposition. We formulate the decomposition problem as the minimization of a quadratic function that incorporates both the retinex constraint and our nonlocal texture constraint. This optimization can be solved in closed form with the standard conjugate gradient algorithm. Extensive experimentation with comparisons to previous techniques validates our method in terms of both decomposition accuracy and runtime efficiency.
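A quadratic energy of this shape, pairwise retinex terms plus nonlocal same-reflectance terms, yields a sparse symmetric positive-definite linear system that conjugate gradient solves directly. A hedged sketch in log-reflectance space follows; the pair lists, unit weights, and the soft anchor are illustrative choices, not the paper's exact formulation:

```python
import numpy as np
from scipy.sparse import lil_matrix, csr_matrix
from scipy.sparse.linalg import cg

def solve_log_reflectance(n, retinex_pairs, nonlocal_pairs, lam=1.0, anchor=0):
    # Minimize  sum_(i,j,t) (r_i - r_j - t)^2  +  lam * sum_(i,j) (r_i - r_j)^2,
    # where t is the target log-gradient (0 across smooth shading, the image
    # log-gradient across reflectance edges) and the second sum ties distant
    # pixels judged by texture analysis to share the same reflectance.
    A = lil_matrix((n, n))
    b = np.zeros(n)
    for i, j, t in retinex_pairs:      # normal equations of the retinex terms
        A[i, i] += 1.0; A[j, j] += 1.0
        A[i, j] -= 1.0; A[j, i] -= 1.0
        b[i] += t;      b[j] -= t
    for i, j in nonlocal_pairs:        # normal equations of the nonlocal terms
        A[i, i] += lam; A[j, j] += lam
        A[i, j] -= lam; A[j, i] -= lam
    A[anchor, anchor] += 1.0           # soft anchor r[anchor] ~ 0 fixes the scale
    r, info = cg(csr_matrix(A), b)
    return r
```

With the log-reflectance r recovered, shading follows as log I − r; the anchor resolves the global scale ambiguity that any gradient-domain formulation of this kind has.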
Collapse
Affiliation(s)
- Qi Zhao
- Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117576
Collapse
|
47
|
|
48
|
The efficacy of local luminance amplitude in disambiguating the origin of luminance signals depends on carrier frequency: Further evidence for the active role of second-order vision in layer decomposition. Vision Res 2011; 51:496-507. [DOI: 10.1016/j.visres.2011.01.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2010] [Revised: 01/16/2011] [Accepted: 01/19/2011] [Indexed: 11/24/2022]
|
49
|
|
50
|
|