CSAIL Research Abstract


Compressing and Companding High Dynamic Range Images with Subband Architectures

Edward H. Adelson, Yuanzhen Li & Lavanya Sharan

Figure 1. The first three images show a high dynamic range scene at three different exposures; no single exposure displays the whole scene well. The fourth image is the range-compressed result of our technique. HDR image source: Cornell CS.

What

We propose a dynamic range compression technique using subband architectures and local gain control. We also adapt this technique for a related problem of "companding", in which the dynamic range of an image is compressed and then expanded.

Why

Natural scenes contain huge ranges of luminance, and it has recently become convenient to capture high dynamic range (HDR) images digitally. Unfortunately, there is no easy way to display them, as most display devices still have limited dynamic range. The classical way of handling dynamic range in photographs is through a point non-linearity, such as a log or power function. However, this usually forces a loss of contrast. Various techniques have been proposed for compressing the dynamic range while retaining important contrast. Multiscale image processing techniques, which are widely used for many image processing tasks, have a reputation for causing halo artifacts when used for range compression. However, we demonstrate that they can work without introducing unpleasant artifacts.
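The contrast loss caused by a point non-linearity can be seen in a few lines. The sketch below is only an illustration: the scene values, the function names, and the exponent 0.5 are assumptions chosen for the example, not taken from our experiments. A log function would behave analogously.

```python
def power_compress(luminances, gamma=0.5):
    """Apply the same power function to every value, ignoring its neighbors."""
    return [v ** gamma for v in luminances]


def dynamic_range(luminances):
    """Ratio of the brightest value to the darkest value."""
    return max(luminances) / min(luminances)


# Hypothetical scene: the same 20% texture step, once in shadow and once in sunlight.
scene = [1.0, 1.2, 1000.0, 1200.0]
compressed = power_compress(scene)

print("range before:", dynamic_range(scene))
print("range after: ", dynamic_range(compressed))
print("shadow step before:", scene[1] / scene[0])
print("shadow step after: ", compressed[1] / compressed[0])
```

The overall range shrinks as intended, but the local step between neighboring values shrinks with it, which is the loss of contrast described above.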

Given that we can compress the range of an HDR image into an LDR (low dynamic range) image, it would also be desirable to invert the process, i.e., to retrieve an HDR image from an LDR image with minimal degradation. Suppose, for instance, that we have compressed a 12-bit image into an 8-bit image. Can we retrieve a good 12-bit image? Clearly we cannot do it perfectly, but perhaps we can get a good approximation. We refer to the range compression/expansion process as "companding", in accordance with audio terminology. One application is in driving HDR displays. Although HDR displays are being developed, most software applications and video cards today handle only 8-bit images. It would be very useful if a laptop could output an 8-bit image and have it converted into a clean 12-bit image on a specialized display. Another application is HDR image storage and transmission. After we turn a 12-bit image into an 8-bit one, the image can be stored in a standard lossless 8-bit format, or can be further compressed with a lossy format such as JPEG.
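For readers unfamiliar with the audio term, the sketch below shows companding in its simplest, point-wise form: the mu-law curve used in telephony, applied here to a single 12-bit value that is squeezed into an 8-bit code and expanded again. This is an analogy only and is not our image technique, which uses spatial, multiscale gain control. The value mu = 255 and all function names are illustrative assumptions.

```python
import math

MU = 255.0  # common mu-law parameter in telephony; an illustrative choice here


def mu_compress(x, mu=MU):
    """Map x in [0, 1] to [0, 1], giving small values more of the output range."""
    return math.log1p(mu * x) / math.log1p(mu)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.expm1(y * math.log1p(mu)) / mu


def compand_12_to_8_to_12(v12):
    """Compress a 12-bit value to an 8-bit code, then expand it back to 12 bits."""
    code8 = round(mu_compress(v12 / 4095.0) * 255)
    return code8, round(mu_expand(code8 / 255.0) * 4095)


def linear_12_to_8_to_12(v12):
    """Baseline: plain linear requantization to 8 bits and back."""
    code8 = round(v12 / 4095.0 * 255)
    return code8, round(code8 / 255.0 * 4095)


for value in (5, 100, 4095):
    print(value, compand_12_to_8_to_12(value), linear_12_to_8_to_12(value))
```

Dark values that linear requantization would collapse to zero survive the round trip, at the cost of coarser steps among bright values. The recovery is approximate, as the text above notes.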

How

An image is split into subbands that are tuned to different spatial frequencies, and local gain control (i.e., contrast normalization [2]) is applied to each subband. The range compression result is then resynthesized from the modified subbands. We tried various subband architectures and found that analysis-synthesis architectures, such as QMFs [1], Haar pyramids, and Laplacian pyramids, work remarkably well when smooth gain maps are computed, allowing huge compression while keeping detail in both bright and dark regions and avoiding unnatural artifacts.
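The split, gain control, and resynthesis steps can be sketched on a 1-D signal. This is a minimal illustration under stated assumptions, not our implementation: the subbands are plain differences of box blurs (so synthesis is a simple sum), the gain formula ((activity + eps) / delta) ** (gamma - 1) and the parameters gamma, delta, eps, and residual_gain are hypothetical choices, and all function names are invented for the example.

```python
def box_blur(values, radius):
    """Box blur whose window shrinks at the two ends of the signal."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def analyze(signal, levels=3):
    """Split into band-pass subbands plus a low-pass residual that sum back to the input."""
    bands, current = [], list(signal)
    for level in range(levels):
        smooth = box_blur(current, 2 ** level)
        bands.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    return bands, current


def gain_map(band, radius, gamma, delta=1.0, eps=1e-3):
    """Smooth gain map: blurred coefficient magnitude, mapped so strong activity gets low gain."""
    activity = box_blur([abs(b) for b in band], radius)
    return [((a + eps) / delta) ** (gamma - 1.0) for a in activity]


def compress(signal, levels=3, gamma=0.6, residual_gain=0.5):
    """Apply a gain map to each subband, scale the residual, and sum everything back."""
    bands, residual = analyze(signal, levels)
    out = [residual_gain * r for r in residual]
    for level, band in enumerate(bands):
        gains = gain_map(band, 2 ** (level + 1), gamma)
        out = [o + g * b for o, g, b in zip(out, gains, band)]
    return out


# Hypothetical input: a large step edge, as at a shadow boundary.
signal = [0.0] * 16 + [100.0] * 16
print([round(v, 1) for v in compress(signal)])
```

Because the gain is computed from a blurred activity map rather than from each coefficient alone, neighboring coefficients are scaled by similar amounts. With gamma = 1 and residual_gain = 1 the gains are all 1 and the input is returned unchanged.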

Companding can be thought of as an encoding-decoding process. We first establish a standard method for doing range expansion, and then create an "encoded" image that will yield the desired image when it is decoded. We find this "encoded" image through iterations, but the decoding (expansion) is a one-shot multiscale procedure, which is very similar to the range compression procedure mentioned above.
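The encode-by-iteration idea can be sketched with a toy decoder. Everything specific here is an assumption made for illustration: the decoder is a single-scale stand-in (each sample is boosted by its local brightness) rather than our multiscale expansion, and the step size, iteration count, starting guess, and function names are hypothetical.

```python
def blur3(values):
    """3-tap box blur; the window shrinks at the two ends."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out


def decode(encoded):
    """Toy one-shot expansion: boost each sample according to its local brightness."""
    local = blur3(encoded)
    return [e * (1.0 + m) for e, m in zip(encoded, local)]


def quantize8(values):
    """Clamp to [0, 1] and round to 8-bit levels."""
    return [round(min(1.0, max(0.0, v)) * 255) / 255 for v in values]


def encode(target, iterations=30, step=0.5):
    """Search for an 8-bit signal whose decoding approximates the target."""
    estimate = [min(1.0, t / 2.0) for t in target]  # crude starting guess
    for _ in range(iterations):
        decoded = decode(estimate)
        # Nudge each sample by a fraction of the decoding error, staying in [0, 1].
        estimate = [min(1.0, max(0.0, e + step * (t - d)))
                    for e, t, d in zip(estimate, target, decoded)]
    return quantize8(estimate)


# Hypothetical target: the decoding of a known signal, so a solution exists.
truth = [0.1, 0.2, 0.8, 0.9, 0.3, 0.1]
target = decode(truth)
encoded = encode(target)
restored = decode(encoded)
print(max(abs(r - t) for r, t in zip(restored, target)))
```

All of the iterative work happens on the encoding side; the receiver only runs decode once, which mirrors the asymmetry described above.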

Figure 2. Baby companding. An 8-bit-per-channel image (left) is compressed into a 3-bit-per-channel image (middle). The middle image can be expanded into an 8-bit image (right), which is very similar to the original image on the left.

Figure 3. Lamp companding. Leftmost: the range compression result (8 bits per channel). The three images on the right show the retrieved 12-bit HDR image at three different exposures. HDR image source: Spheron.

Research Support

Supported by a MURI grant via the Office of Naval Research (N00014-01-1-0625), NSF grant BCS_0345805, and contracts from NTT Labs, Shell Labs, and the National Geospatial-Intelligence Agency (NGA).

References

[1] Edward H. Adelson, Eero Simoncelli, R. Hingorani. Orthogonal pyramid transforms for image coding. In Visual Communications and Image Processing II, Proc. SPIE, vol. 845, 50–58. 1987.

[2] David Heeger. Half-squaring in responses of cat striate cells. Visual Neurosci. 9, 427–443. 1992.

[3] Yuanzhen Li, Lavanya Sharan, Edward H. Adelson. Compressing and Companding High Dynamic Range Images with Subband Architectures. To appear in SIGGRAPH 2005.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu
(Note: On July 1, 2003, the AI Lab and LCS merged to form CSAIL.)