MIT CSAIL Research Abstracts

Basic Problem

An image of a scene can be represented as the composition of several different images. At the image-formation level, the image is the sum of a noise image and the true image of the scene. The scene image itself can in turn be represented as a composition of images that describe the characteristics of the surfaces in the scene. Two of the most fundamental of these characteristics are the shading and the albedo of each point, and both can be represented flexibly as images. Figure 1 shows how the shading and albedo of the surface in (a) can be represented with one intrinsic image for the shading, shown in Figure 1(b), and a second image for the albedo of each point, shown in Figure 1(c). Our goal is to decompose a single image, such as Figure 1(a), into shading and reflectance images.

**Figure 1:** Example of shading and reflectance intrinsic images. (a) Image of a scene. (b) The reflectance intrinsic image, which contains only the reflectance of each point. (c) The shading intrinsic image, which results from the interaction of the illumination of the scene and the shape of the surface.
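The composition described above can be illustrated with a small sketch. It assumes the multiplicative model common in the intrinsic-image literature (observed = shading × albedo + noise); the abstract itself only says the image is a "composition", so the product, the function name `compose_image`, and the toy values are all illustrative assumptions.

```python
import numpy as np

def compose_image(shading, albedo, noise_sigma=0.0, seed=0):
    """Form an observed image from shading and albedo images.

    Assumption: observed = shading * albedo + Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    scene = shading * albedo                           # noise-free scene image
    noise = rng.normal(0.0, noise_sigma, scene.shape)  # additive noise image
    return scene + noise

# Toy 1x4 "surface": a smooth shading ramp with a painted patch in the albedo.
shading = np.array([[0.2, 0.4, 0.6, 0.8]])
albedo = np.array([[1.0, 0.5, 0.5, 1.0]])
observed = compose_image(shading, albedo)

# In the log domain the product becomes a sum of two images.
log_sum = np.log(shading) + np.log(albedo)
```

Under this assumption, decomposing a single image means recovering two unknown images from one observed product, which is why additional cues or learned priors are needed.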

Why

Every natural image is the combination of the characteristics of the scene in the image. Understanding the contents of an image requires some method of reasoning about these characteristics and how they affect the final image. One of the fundamental difficulties of vision is that these characteristics are mixed together in the observed image. Before any high-level tasks can be accomplished, the effects of these characteristics must be distinguished. For example, to recover the shape of the surface in Figure 1(a), the shading of the surface must be distinguished from the four squares that have been painted on the surface.

The intrinsic image decomposition facilitates analysis of the scene in an image by separating out the important intrinsic characteristics of the scene into distinct images. This is useful because images are a general representation of the scene which can easily be incorporated into higher-level analysis.

Progress

In our previous work [3], the shading and reflectance intrinsic images are found by classifying each image derivative as being caused by either shading or a reflectance change. While this system works well on some real-world images, it suffers from two main limitations: a lack of real-world training examples with known albedo, and a reliance on classification.

In our recent work [2], we have addressed the first problem by taking advantage of color to produce examples of real-world surfaces with known albedo. Figure 2(a) shows the red channel of an image of a piece of paper colored with a Crayola "Electric Lime-Green" Washable Marker and illuminated with a standard incandescent light. Both the markings and the surface shading are visible. The green channel of that image, shown in Figure 2(b), does not contain any of these markings. We use this image as the shading image.

**Figure 2:** These images are the red and green channels of a photograph of a piece of paper colored with a green marker. The coloring can be seen in (a) but not in (b). We use this property to construct a training set of images and the corresponding ground-truth shading and albedo images.
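The color trick can be sketched as follows. The abstract states only that the green channel is used as the shading image; dividing the red channel by it to obtain an albedo image assumes a multiplicative model (red = shading × albedo), and the function name `ground_truth_from_channels` and the toy pixel values are hypothetical.

```python
import numpy as np

def ground_truth_from_channels(rgb, eps=1e-6):
    """Build an (observed, shading, albedo) triple from one RGB photograph.

    Assumptions: the marker is invisible in the green channel, and the red
    channel equals shading times albedo.
    """
    rgb = np.asarray(rgb, dtype=float)
    observed = rgb[..., 0]                        # red: markings plus shading
    shading = rgb[..., 1]                         # green: shading only
    albedo = observed / np.maximum(shading, eps)  # guard against division by 0
    return observed, shading, albedo

# Toy 1x3 image: the middle pixel is "marked" (red is halved, green untouched).
rgb = np.array([[[0.4, 0.4, 0.0], [0.3, 0.6, 0.0], [0.8, 0.8, 0.0]]])
observed, shading, albedo = ground_truth_from_channels(rgb)
```

Each photograph then yields one training example: the red channel as input, with the green channel and the ratio as the target shading and albedo images.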

To address the second limitation of our previous work, the reliance on classification, we have developed a method that treats the decomposition as a continuous estimation problem. Rather than training a classifier to label each derivative in the image as either shading or a reflectance change, we learn to estimate the value of each derivative. This system performs much better than the standard, simple heuristics. Figure 3 shows an example comparison between the intrinsic images estimated by our method and those from a method based on the same heuristic as Retinex [1]. Our method cleanly removes the writing from the shading image.
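The difference between labeling a derivative and estimating its value can be illustrated with a generic regressor. The sketch below uses Nadaraya-Watson kernel regression purely as a stand-in; it is not the regression used in [2], and the single scalar feature, the toy training pairs, and the bandwidth are all assumptions made for illustration.

```python
import numpy as np

def kernel_regression(train_x, train_y, query_x, bandwidth=0.1):
    """Nadaraya-Watson estimate of y at each query point (Gaussian kernel)."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    query_x = np.asarray(query_x, dtype=float)
    diff = query_x[:, None] - train_x[None, :]        # pairwise differences
    weights = np.exp(-0.5 * (diff / bandwidth) ** 2)  # kernel weights
    return (weights @ train_y) / weights.sum(axis=1)  # weighted average

# Toy training pairs: observed derivative -> ground-truth shading derivative.
# Small derivatives are all shading; large ones are reflectance edges.
observed_deriv = [-1.0, -0.1, -0.05, 0.0, 0.05, 0.1, 1.0]
shading_deriv = [0.0, -0.1, -0.05, 0.0, 0.05, 0.1, 0.0]

pred_small, pred_large = kernel_regression(
    observed_deriv, shading_deriv, query_x=[0.05, 1.0]
)
```

A classifier would output only a hard label for each derivative, whereas the regressor returns a continuous shading-derivative estimate that can fall anywhere between the two extremes.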

**Figure 3:** A comparison of the shading and albedo images estimated by our method versus the estimates produced by a system based on the same heuristic as the Retinex algorithm [1]. Our method produces a much cleaner separation of the shading and albedo information.
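For context, the Retinex-style heuristic used as the baseline can be sketched in one dimension: large log-image derivatives are attributed to reflectance, small ones to shading, and each component is recovered by integrating its derivatives. This is a simplified illustration and not the comparison system from Figure 3; the threshold, the function name `retinex_1d`, and the toy signal are assumptions.

```python
import numpy as np

def retinex_1d(signal, threshold=0.2):
    """Split a positive 1-D signal into shading and reflectance by thresholding."""
    log_signal = np.log(np.asarray(signal, dtype=float))
    deriv = np.diff(log_signal)
    is_reflectance = np.abs(deriv) > threshold        # large steps: albedo edges
    refl_deriv = np.where(is_reflectance, deriv, 0.0)
    shade_deriv = deriv - refl_deriv                  # remainder: shading
    # Integrate each derivative signal. The unknown constant is resolved by
    # assigning the first sample entirely to shading.
    log_shading = log_signal[0] + np.concatenate(([0.0], np.cumsum(shade_deriv)))
    log_reflectance = np.concatenate(([0.0], np.cumsum(refl_deriv)))
    return np.exp(log_shading), np.exp(log_reflectance)

# Toy signal: a gentle shading ramp times an albedo with a dark painted patch.
true_shading = np.array([1.0, 1.05, 1.1, 1.15, 1.2])
true_albedo = np.array([1.0, 1.0, 0.5, 0.5, 1.0])
signal = true_shading * true_albedo
est_shading, est_albedo = retinex_1d(signal)
```

Because each derivative is assigned wholly to one component, any shading change that coincides with an albedo edge is absorbed into the reflectance estimate. That all-or-nothing assignment is the limitation that estimating derivative values is meant to avoid.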

Our method is also flexible enough to be applied to the problem of denoising images. On this problem, it produces denoised images that are competitive with those of state-of-the-art denoising methods.

Research Support

This work was supported by an NDSEG fellowship to MFT, by a grant from the Nippon Telegraph and Telephone Corporation as part of the NTT/MIT Collaboration Agreement, and by a grant from Shell Oil.

References

[1] E. H. Land and J. J. McCann. Lightness and retinex theory. Journal of the Optical Society of America, 61:1-11, 1971.

[2] M. F. Tappen, E. H. Adelson, and W. T. Freeman. Estimating intrinsic component images using non-linear regression. To appear in the 2006 IEEE Conference on Computer Vision and Pattern Recognition, 2006.

[3] M. F. Tappen, W. T. Freeman, and E. H. Adelson. Recovering intrinsic images from a single image. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 1343-1350. MIT Press, Cambridge, MA, 2003.

Estimating Intrinsic Images using Non-Linear Regression

Marshall F. Tappen, Edward H. Adelson & William T. Freeman
