Research Abstracts - 2007
Face Hallucination: Theory and Practice

Ce Liu*, Heung-Yeung Shum† & William T. Freeman*

*Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology
†Microsoft Research Asia

To appear in International Journal of Computer Vision (IJCV). [pdf] [SpringerLink]
An earlier version of this work was published at CVPR 2001. [pdf] [ppt]

What

In this paper, we study face hallucination: synthesizing a high-resolution face image from a low-resolution input image with the help of a large collection of other high-resolution face images. Our theoretical contribution is a two-step statistical modeling approach that integrates a global parametric model with a local nonparametric model. Our practical contribution is a robust warping algorithm that aligns low-resolution face images so that good hallucination results can be obtained. Extensive experiments demonstrate the effectiveness of our approach, producing high-quality hallucinated face images with no manual alignment.

Why

Many computer vision tasks require inferring a missing high-resolution image from a low-resolution input. Of particular interest is inferring high-resolution (abbr. high-res) face images from low-resolution (abbr. low-res) ones. This problem was introduced by Baker and Kanade [1] as face hallucination. The technique has broad applications in image enhancement, image compression and face recognition. It can be especially useful in surveillance systems, where the resolution of face images in video is normally low, but the facial details that a high-res image would contain are crucial for identification and further analysis.

How

We propose that a successful face hallucination algorithm should meet the following three constraints:

(a) Data constraint: the result must be consistent with the low-res input image.
(b) Global constraint: the result must have the common characteristics of a human face.
(c) Local constraint: the result must have the specific local features of this particular face.

Such global and local constraints motivate us to design a hybrid approach in this paper. We combine a global parametric model, which generalizes well to common faces, with a local nonparametric model, which learns local textures from example faces. We incorporate all the constraints in a statistical face model and find the maximum a posteriori (MAP) solution for the hallucinated face. The data constraint is modeled as a Gaussian distribution (a soft constraint), or simply as an equality constraint (a hard constraint). The global constraint assumes a Gaussian distribution learned by principal component analysis (PCA). The local constraint uses a patch-based nonparametric Markov network to learn the statistical relationship between the global face image and the local features.

A two-step approach is then used to hallucinate faces. First, an optimal global face image is sought in the eigenspace subject to constraints (a) and (b). Second, an optimal local feature image is inferred from the optimal global image by minimizing the energy of the Markov network with constraint (c) applied. The sum of the global and local images forms the final result. An example of a hallucinated image produced from a low-resolution input is shown in Figure 1. Although the facial feature details of the hallucinated face differ from those in the original, we may still perceive it as a valid human face taken by a camera.

As a practical matter, the other challenge in face hallucination is the difficulty of aligning faces in low-res images. Many learning-based image synthesis models require alignment between the test sample and the training examples, e.g. [2]. Even a small amount of misalignment can dramatically degrade the synthesized result. However, the facial features may contain very few pixels, faces in real images are normally not upright, and the scale and position must be estimated at the sub-pixel level.
Therefore, alignment at low-res requires that very accurate measurements be made from very little data. To address this challenge, we design a face alignment algorithm that works at low-res. The algorithm finds an affine transform that warps the input image to a template so as to maximize the probability of the low-res face image, as determined from an eigenspace representation. To make the alignment step robust, a stochastic algorithm explores multiple candidate starting points, and the best alignment result is selected automatically.

Experimental Results

We use the CMU face database [4] and some other images to test the face hallucination system. We first run the system on a number of images; the result for a collection of test images is shown in Figure 2. Each pair of low-res (32x24) and high-res (128x96) faces is displayed to the right of or below the original image from which the low-res face was detected, registered and extracted. Clearly, our system is able to hallucinate the details of facial features, particularly the eyes, eyebrows, mouth and nose, even though they are not visible in the low-res input.
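
To make the resolutions above concrete, the following is a minimal sketch of the first (global) step only, under stated assumptions: the data are synthetic random images rather than faces, the down-sampling operator is plain 4x4 block averaging, and the soft (Gaussian) data constraint is solved in closed form through the normal equations. The function names, the noise level sigma and the number of PCA components are hypothetical choices made for illustration; this is not necessarily the authors' exact formulation, and the local Markov-network step is not covered.

```python
import numpy as np

def downsample(img, factor=4):
    """Average non-overlapping factor x factor blocks (stand-in for smoothing + subsampling)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def fit_pca(faces, k):
    """faces: (n, h*w) array of flattened images. Returns mean, basis (h*w, k), variances (k,)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # Rows of vt are principal directions; s**2 / (n - 1) are the PCA variances.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k].T, (s[:k] ** 2) / (len(faces) - 1)

def global_map_face(low, mean, basis, variances, shape, factor=4, sigma=0.05):
    """MAP estimate of PCA coefficients x, assuming a Gaussian prior N(0, diag(variances))
    and a Gaussian data term ||A(Bx + mean) - low||^2 / sigma^2 (soft constraint)."""
    h, w = shape
    # Apply the down-sampling operator A to the mean and to each basis vector.
    a_mean = downsample(mean.reshape(h, w), factor).ravel()
    a_basis = np.stack(
        [downsample(b.reshape(h, w), factor).ravel() for b in basis.T], axis=1
    )
    # Normal equations: (AB^T AB / sigma^2 + Lambda^-1) x = AB^T (low - A mean) / sigma^2
    lhs = a_basis.T @ a_basis / sigma**2 + np.diag(1.0 / variances)
    rhs = a_basis.T @ (low.ravel() - a_mean) / sigma**2
    x = np.linalg.solve(lhs, rhs)
    return (mean + basis @ x).reshape(h, w)

# Synthetic demo (assumption: images drawn from a noise-free 8-dimensional linear model).
rng = np.random.default_rng(0)
H, W, K = 128, 96, 8
components = rng.normal(size=(K, H * W))
train = 0.5 + rng.normal(size=(60, K)) @ components
truth = (0.5 + rng.normal(size=K) @ components).reshape(H, W)

low = downsample(truth)                      # 32x24 low-res input
mean, basis, variances = fit_pca(train, K)
result = global_map_face(low, mean, basis, variances, (H, W))
rel_err = np.linalg.norm(result - truth) / np.linalg.norm(truth)
print(low.shape, result.shape, rel_err)
```

Because the synthetic test image lies exactly in the learned subspace, the reconstruction here is nearly perfect; real faces do not, which is why the approach described above adds a local nonparametric step on top of the global estimate.
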

What if you have forgotten the faces of some of your teammates, and the only old photo you have is small and blurred? Our face hallucination system may be able to help, as shown below.

Acknowledgement

The authors appreciate the help of Lin Liang of MSRA in aligning the training faces and running the face detector on the test images. Ce Liu would like to thank Edward Adelson, Antonio Torralba and Bryan Russell for insightful discussions. Heung-Yeung Shum thanks Takeo Kanade for helpful discussions on face hallucination and computer vision. Ce Liu is supported by a Microsoft Fellowship.

References

[1] S. Baker and T. Kanade. Hallucinating faces. In IEEE International Conference on Automatic Face and Gesture Recognition, March 2000.
[2] H. Chen, Y. Q. Xu, H. Y. Shum, S. C. Zhu, and N. N. Zheng. Example-based facial sketch generation with non-parametric sampling. In Proc. IEEE Int'l Conf. Computer Vision, pages 433-438, 2001.
[3] C. Liu, H. Y. Shum, and C. S. Zhang. A two-step approach to hallucinating faces: Global parametric model and local nonparametric model. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 192-198, 2001. [pdf] [ppt] (presented by Harry Shum at CVPR)
[4] H. A. Rowley, S. Baluja, and T. Kanade. Neural network-based face detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(1):23-38, 1998.