other challenges, face recognition from low-resolution video data reduces the performance of existing systems significantly.
The goal of this thesis is to develop a system that uses SR as an intermediate step for face recognition in low-resolution video and to analyze the resulting improvements in face recognition rate. Multi-frame SR is a practical solution that may increase frame resolution, which could in turn improve face recognition rates. The approach of this thesis uses a set of mutually unregistered low-resolution face frames captured from video to construct a new frame of higher resolution (e.g., see Figure 1). In practice, combining information from multiple images in this way is not trivial. A super-resolution algorithm has to solve two main problems. First, all input images must be correctly aligned with each other on a common grid. Next, an accurate, sharp image has to be reconstructed from the gathered information. If either of these two steps is done poorly, the resulting image is not acceptable and no gain in resolution is obtained.
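The two steps named above, alignment on a common grid followed by reconstruction, can be illustrated with a minimal shift-and-add sketch in Python. It is only an illustration under strong assumptions: the registration step is taken as already solved, so each frame's offset on the high-resolution grid is known exactly, and blur and noise are ignored. The function names `downsample` and `shift_and_add` are hypothetical and are not taken from the thesis.

```python
def downsample(hr, factor, dy, dx):
    """Simulate an LR frame: keep every `factor`-th HR pixel, starting at (dy, dx)."""
    return [row[dx::factor] for row in hr[dy::factor]]


def shift_and_add(frames, shifts, factor):
    """Place each LR sample at its registered HR position and average overlaps.

    `shifts` holds one (dy, dx) offset per frame, in HR pixels. It is assumed
    to come from a registration step that is not shown here.
    """
    height = len(frames[0]) * factor
    width = len(frames[0][0]) * factor
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for frame, (dy, dx) in zip(frames, shifts):
        for i, row in enumerate(frame):
            for j, value in enumerate(row):
                y, x = i * factor + dy, j * factor + dx
                total[y][x] += value
                count[y][x] += 1
    # HR positions that no frame covered stay None; a real system would
    # interpolate them instead.
    return [
        [total[y][x] / count[y][x] if count[y][x] else None for x in range(width)]
        for y in range(height)
    ]


# Toy example: a 4x4 "scene" observed by four 2x2 frames with distinct offsets.
scene = [[float(4 * r + c) for c in range(4)] for r in range(4)]
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [downsample(scene, 2, dy, dx) for dy, dx in shifts]
fused = shift_and_add(frames, shifts, 2)
```

With all four offsets present, every position of the high-resolution grid receives a sample. With fewer frames, or with wrong offsets, parts of the grid stay empty or are filled incorrectly, which is why the alignment step matters as much as the reconstruction step.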
We evaluate the performance of face recognition algorithms on images at various resolutions. We then show to what extent super-resolution (SR) methods can improve recognition performance when comparing low-resolution (LR) to high-resolution (HR) facial images. Our experiments use both synthetic data (from the FRGC v1.0 database) and surveillance images (from the SCface database). Three face recognition methods are used, namely Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Local Binary Patterns (LBP). Two SR methods are evaluated. The first method learns the mapping between LR images and the corresponding HR images using a regression model. As a result, the reconstructed SR images are close to the HR images that belong to the same subject and far away from the others. The second method compares LR and HR facial images without explicitly constructing SR images. It finds a coherent feature space in which the correlation between LR and HR is maximal, and then computes the mapping from LR to HR in this feature space. The performance of the two SR methods is compared to that delivered by standard face recognition without SR. The results show that LDA is mostly robust to resolution changes, while LBP is not suitable for the recognition of LR images. SR methods improve the recognition accuracy when downsampled images are used, and the first method provides better results than the second one. However, the improvement for realistic LR surveillance images remains limited.
We evaluate the performance of face recognition using images with different resolutions. The experiments are conducted on the Face Recognition Grand Challenge version one (FRGC v1.0) database and the Surveillance Cameras Face (SCface) database. Three recognition methods are used, namely Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Local Binary Pattern (LBP). To improve the performance on low-resolution (LR) face images, two state-of-the-art super-resolution (SR) methods are applied. The first is Discriminative Super-resolution (DSR). It finds the relationship between low-resolution images and their corresponding high-resolution (HR) images, so that the reconstructed super-resolution images are close to the HR images that belong to the same subject and far away from the others. The other SR method uses Nonlinear Mappings on Coherent Features (NMCF). Canonical Correlation Analysis is applied to compute the coherent features between the PCA features of HR and LR images. Radial Basis Functions (RBFs) are then used to find the mapping from LR features to HR features in the coherent feature space. The two SR methods are also compared on both the FRGC and SCface databases.
Obtaining a higher PSNR does not necessarily contribute to a higher recognition rate, since high-fidelity reconstruction of low-frequency content may dominate the image. Face recognition degrades when probe faces are of significantly lower resolution than those in the gallery. We used previously proposed super-resolution methods in the spatial domain, as well as standard bicubic interpolation, to reconstruct a higher-resolution version of the low-resolution probe and then performed matching in the usual way at the higher resolution. Table 3 shows the identification accuracy for different illumination subsets of the Extended Yale B database. We used adaptive histogram equalization (AHE) to normalize variations in illumination of the super-resolved images. The experimental results are mixed. While in some cases there is no significant difference between the SR methods (see the results for set 2), in set 1 dictionary-based super-resolution as pre-processing results in better matching accuracy in the LL3 subband. Rather surprisingly, when the LH3 subband is used for matching images in sets 3 and 4, and to some degree in set 5, bicubic interpolation increases recognition accuracy as much as, if not more than, the more complex SR methods. This could be due to the presence of blocky artifacts in the non-interpolation-based super-resolution methods, which degrade the face feature vectors, especially in badly lit images.
Super-resolution is the technique of converting low-resolution images into high-resolution images. The SR task is cast as the inverse problem of recovering the original high-resolution image by fusing the low-resolution images, based on reasonable assumptions or prior knowledge about the observation model that converts the high-resolution image into the low-resolution ones. The fundamental reconstruction constraint for SR is that the recovered image, after applying the same generation model, should reproduce the observed low-resolution image. SR algorithms can be categorized into four classes. Interpolation-based algorithms register low-resolution images (LRIs) with the high-resolution image (HRI), and then apply non-uniform interpolation to produce an improved-resolution image, which is further deblurred. Frequency-based algorithms try to dealias the LRIs by utilizing the phase difference among the LRIs.
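The reconstruction constraint stated above can be made concrete with a small Python sketch. It assumes a deliberately simple generation model, block averaging followed by decimation, which is an illustrative choice and not the observation model of any particular method discussed here. The names `observe`, `residual` and `back_project` are hypothetical.

```python
def observe(hr, factor):
    """Assumed generation model: average each factor x factor block of the HR image."""
    rows, cols = len(hr) // factor, len(hr[0]) // factor
    return [
        [
            sum(hr[i * factor + di][j * factor + dj]
                for di in range(factor) for dj in range(factor)) / factor ** 2
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def residual(hr_estimate, lr_observed, factor):
    """Difference between the observed LR image and the LR image simulated from the estimate."""
    simulated = observe(hr_estimate, factor)
    return [
        [obs - sim for obs, sim in zip(obs_row, sim_row)]
        for obs_row, sim_row in zip(lr_observed, simulated)
    ]


def back_project(hr_estimate, lr_observed, factor):
    """One correction step: add each LR residual to every HR pixel of its block."""
    error = residual(hr_estimate, lr_observed, factor)
    return [
        [value + error[y // factor][x // factor] for x, value in enumerate(row)]
        for y, row in enumerate(hr_estimate)
    ]


# Toy example: start from an all-zero HR estimate and enforce the constraint once.
lr = [[10.0, 20.0], [30.0, 40.0]]
estimate = [[0.0] * 4 for _ in range(4)]
refined = back_project(estimate, lr, 2)
```

Under this block-averaging model a single correction already satisfies the constraint, because adding the residual to every pixel of a block shifts the block mean by exactly that residual. The constraint alone does not make the result sharp: many high-resolution images reproduce the same observation, which is why prior knowledge or several differently shifted frames are needed.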
Typically, there are two approaches to low-resolution face recognition. The hallucination category aims to reconstruct high-resolution faces before recognition, while the embedding category extracts features directly from low-resolution faces via an embedding scheme. In the hallucination category, Kolouri et al. constructed a nonlinear Lagrangian model of high-resolution facial appearance and then found the model parameters that best fit the low-resolution faces. Jian et al. proposed a framework based on singular value decomposition that performs face hallucination and recognition simultaneously. A joint face hallucination and recognition framework based on sparse representation has also been proposed; this framework can synthesize person-specific low-resolution faces for recognition. Another system recognizes faces by using sparse representation with a specific dictionary involving many natural and facial images. Moreover, deep models can generate extremely realistic high-resolution images from low-resolution faces. However, such hallucination or super-resolution based approaches may be slow due to the complex high-resolution face reconstruction process, which hinders their direct deployment in real-world scenarios with limited computational resources.
(Equation 3.15), but uses a multi-resolution AAM where the effective resolution of the model is chosen based on the resolution of the input, hence avoiding significant interpolation of the input image during fitting. The training data is first down-sampled to lower resolutions at different scales. An AAM is then trained at each resolution. The landmarks at lower resolutions are obtained by scaling the landmarks from the HR images. Thus, the mean shapes of the multi-resolution AAM differ only by a scaling factor, while the shape basis vectors are exactly the same across different resolutions. During fitting, an appropriate model resolution is chosen based on the resolution of the input image. The authors compared the fitting convergence of models at different resolutions and concluded that the best performance is obtained when the resolution of the model is only slightly higher than that of the input image. Furthermore, a face tracking experiment using a person-specific AAM was performed, and the results showed that when fitting to an LR image, using an AAM with a resolution close to the input yields better performance than using a high-resolution AAM. The authors did not apply their method to face recognition in [68]. Hence, it is not clear whether the parameter estimates obtained with this method are robust enough for recognition. However, the same authors used this approach in a model-assisted framework for super-resolving facial texture as a pre-processing step for face recognition. An image formation model similar to Equation 3.1 was used, where the registration was performed by fitting a multi-resolution AAM on the LR inputs. The authors then used a similar approach for super-resolution where the super-resolution criterion function is an LI
Image super-resolution is a classical problem in the domain of computer vision. It aims to infer an HR image with crucial information from the given LR images. Face hallucination is a branch of image super-resolution that develops domain-specific prior knowledge tied closely to the face domain. It was first introduced by Baker and Kanade and has attracted growing attention due to its practical importance in many face-based applications such as face recognition, face alignment and so on. With the development of machine learning, numerous learning-based methods have been proposed to solve the face hallucination problem. Learning-based algorithms have been seen to achieve a higher magnification factor with better visual quality than the other super reso-
How to cite this paper: Xia, J.F., Yang,
Abstract
In this paper, a face super-resolution method using two-dimensional canonical correlation analysis (2D CCA) is presented. A detail compensation step follows to add high-frequency components to the reconstructed high-resolution face. Unlike most previous research on face super-resolution algorithms, which first transforms the images into vectors, in our approach the relationship between the high-resolution and the low-resolution face image is maintained in their original 2D representation. In addition, rather than approximating the entire face, different parts of a face image are super-resolved separately to better preserve the local structure. The proposed method is compared with various state-of-the-art super-resolution algorithms using multiple evaluation criteria, including face recognition performance. Results on publicly available datasets show that the proposed method super-resolves high-quality face images that are very close to the ground truth, and that the performance gain is not dataset dependent. The method is very efficient in both the training and testing phases compared to the other approaches.
Figure 1. Example sub-pixel displacements.
2. IMAGE SUPER-RESOLUTION THEORY
The field of image super-resolution arose from the need to overcome the physical limitations of low-resolution (LR) imaging systems to generate higher-resolution images than would be otherwise possible with the available hardware. For example, in surveillance applications, single video frames are relatively low in image detail and not well-suited to tasks such as face recognition. Similarly, even the high-end optics on imaging satellites are not always sufficient to distinguish important scene features. Fortunately, when a moderate amount of scene motion exists between frames, the data in these low-resolution images can be fused to yield an image of higher resolution than any one of the frames. A variety of approaches can be found in the literature for exploiting this scenario, and a comprehensive overview of the state of the art in image SR can be found in [1], to which we refer the interested reader. For the purpose of a tutorial for the ATR community, a brief overview of [1] is presented in this section.
Depth facial data may also benefit from the SR framework. Recently, Berretti et al. proposed to use SR on facial depth images once back-projected in 3-D, and defined the superfaces approach. The SR algorithm they deployed is similar in principle to the initial blurred estimate provided in the enhanced Shift & Add algorithm proposed by Al Ismaeil et al. Later on, this work was extended to the dynamic case, where the considered multiple realizations were ordered frames constituting a video sequence. This approach is referred to as Upsampling for Precise Super-Resolution (UP-SR). Its key component is a prior upsampling of the observed data, which is proven to enhance the registration of frames over time. In addition, it uses a bilateral total variation framework as a smoothness condition. A similar concept of temporal fusion has also been considered for 3-D facial data enhancement. However, the increase in resolution was induced from temporal data cumulation without a real SR formulation or upsampling. Moreover, smoothness was ensured by bilateral filtering as a post-processing operation and was not included in the optimization objective function.
Based on these observations, we propose an approach for face recognition in real surveillance environments. In this paper we focus on the indoor surveillance environment, e.g., a corridor where people generally walk in a single direction at a relatively slow and steady pace. Our focus is hence on face recognition on surveillance-captured face images with low resolution, varied illumination conditions, small pose variation, and slow motion. Due to the very low resolution of the captured face images, many face features are lost. Image pre-processing is employed to remove illumination variations as much as possible. In order to accumulate more features, we fuse a video sequence into one frame in the frequency domain. Curvelet features are adopted in the fusion process. The image is further improved through image super-resolution methods in order to increase the image resolution. Experimental results demonstrate that the proposed approach is able to improve face recognition performance.
Super-resolution (SR) represents a class of signal processing methods that create a high-resolution (HR) image from several low-resolution (LR) images of the same scene. In this way, high spatial frequency information can be recovered. Applications include, but are not limited to, HDTV, biological imaging, surveillance, and forensic investigation. In this work, a survey of SR methods is provided, with a focus on the non-uniform interpolation SR approach because of its lower computational demand. Based on this survey, eight SR reconstruction algorithms were implemented. The performance of these algorithms was evaluated by means of the objective image quality criteria PSNR and MSSIM and by computational complexity, to determine the most suitable algorithm for real video applications. The algorithm should be computationally efficient enough to process a large number of color images and should achieve good image quality for input videos with various characteristics. This algorithm has been successfully applied, and its performance is illustrated on examples of real video sequences from different domains.
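Of the criteria named above, PSNR is the simplest to state. The following Python sketch uses the standard definition, 10·log10(peak² / MSE). It assumes 8-bit grayscale images stored as nested lists and a default peak value of 255; the function name `psnr` is chosen here for illustration and does not come from the surveyed work.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images: the error is zero, so PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: two pixels, each off by 10 gray levels (MSE = 100).
score = psnr([[100, 100]], [[110, 90]])
```

PSNR measures pixel-wise fidelity only, so it is usually reported alongside a structural measure such as MSSIM.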
Degradation Models: Accurate degradation (observation) models promise improved SR reconstructions. Several SR application areas may benefit from improved degradation modeling. Only recently has color SR reconstruction been addressed. Improved motion estimates and reconstructions are possible by utilizing correlated information in color bands. Degradation models for lossy compression schemes (color subsampling and quantization effects) promise improved reconstruction of compressed video. Similarly, considering degradations inherent in magnetic media recording and playback is expected to improve SR reconstructions from low-cost camcorder data. The response of typical commercial CCD arrays departs considerably from the simple integrate-and-sample model prevalent in much of the literature. Modeling of sensor geometry, spatio-temporal integration characteristics, noise and readout effects promises more realistic observation models, which are expected to result in SR reconstruction performance improvements.
applications, such as access to top security domains, may even necessitate the forgoing of the nonintrusive quality of face recognition by requiring the user to stand in front of a 3D scanner or an infra-red sensor. Therefore, depending on the face data acquisition methodology, face recognition techniques can be broadly divided into three categories: methods that operate on intensity images, those that deal with video sequences, and those that require other sensory data such as 3D information or infra-red imagery. The following discussion sheds some light on the methods in each category and attempts to give an idea of some of the general benefits and drawbacks of the schemes mentioned therein.
We address the dynamic super-resolution (SR) problem of reconstructing a high-quality set of monochromatic or color super-resolved images from low-quality monochromatic, color, or mosaiced frames. Our approach includes a joint method for simultaneous SR, deblurring, and demosaicing, thereby taking into account the practical color measurements encountered in video sequences. For the case of translational motion and common space-invariant blur, the proposed method is based on a very fast and memory-efficient approximation of the Kalman filter (KF). Experimental results on both simulated and real data are supplied, demonstrating the presented algorithms and their strength.
One of the most common tools used in image processing, especially in resolution enhancement techniques, is the wavelet transform [13-17]. A 1-level discrete wavelet transform (DWT) of a frame of a video sequence produces a low-frequency subband known as low-low (LL), and 3 high-frequency subbands, low-high (LH), high-low (HL), and high-high (HH), oriented at horizontal (0°), diagonal (45°), and vertical (90°) angles. In this paper, a video super-resolution method is proposed. This resolution enhancement technique uses the DWT to decompose low-resolution input frames. The LH, HL, and HH subbands of the frames are super-resolved using bicubic interpolation. At the same time, the input low-resolution frames are super-resolved using the Irani and Peleg technique. Illumination inconsistency can be attributed to uncontrolled environments. Because the Irani and Peleg registration technique is used, it is an advantage if the frames used in the registration process have the same illumination. For this reason, a new illumination compensation method using singular value decomposition (SVD) is also proposed in this paper. The illumination compensation technique is applied to the frames as a preprocessing stage, and then the Irani and Peleg resolution enhancement technique is implemented on the processed frames. Finally, the inverse DWT (IDWT) is used to combine the interpolated high-frequency subbands, obtained from the DWT of the corresponding frames, with their respective super-resolved input frames to reconstruct a super-resolved video sequence. For comparison purposes, the methods of Keren et al., Lucchese and Cortelazzo, Marcel et al., and Vandewalle et al. were used for registration, followed by various reconstruction techniques such as the robust super-resolution technique, bicubic interpolation, iterated back projection, and structure adaptive normalized convolution.
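The one-level decomposition into LL, LH, HL and HH subbands, and the inverse transform that recombines them, can be sketched in Python as follows. The sketch assumes the Haar wavelet, which is the simplest choice and is not stated in the paper, and it covers only the decomposition and reconstruction steps, not the bicubic interpolation or the Irani and Peleg stage. The function names `haar_dwt2` and `haar_idwt2` are hypothetical, and the labelling of the two mixed subbands (LH versus HL) varies between libraries.

```python
def haar_dwt2(image):
    """One-level 2-D Haar DWT of an image with even height and width."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(image), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]          # top row of the 2x2 block
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom row of the 2x2 block
            ll_row.append((a + b + d + e) / 2)  # local average (low-pass both ways)
            lh_row.append((a + b - d - e) / 2)  # difference between rows
            hl_row.append((a - b + d - e) / 2)  # difference between columns
            hh_row.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the image from its four subbands."""
    rows, cols = len(ll), len(ll[0])
    image = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            s, v, h, x = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            image[2 * i][2 * j] = (s + v + h + x) / 2
            image[2 * i][2 * j + 1] = (s + v - h - x) / 2
            image[2 * i + 1][2 * j] = (s - v + h - x) / 2
            image[2 * i + 1][2 * j + 1] = (s - v - h + x) / 2
    return image


# Toy example: decompose a 4x4 frame and rebuild it from its subbands.
frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ll, lh, hl, hh = haar_dwt2(frame)
rebuilt = haar_idwt2(ll, lh, hl, hh)
```

Because the transform is invertible, any enhancement applied to individual subbands before the inverse step carries directly into the reconstructed frame. That property is what allows separately processed subbands to be recombined into a single output.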
ABSTRACT: Image inpainting recovers the missing parts of an image, which matters because images preserve memories of life's important moments. The novel framework presented here is an inpainting model in which inpainting is performed on a coarse version of the image. Inpainting a low-resolution image is simpler than inpainting a high-quality image. Different inpainting techniques are applied to the low-resolution image, and the results are combined to create a high-quality image. For this purpose, our system uses a single-image super-resolution algorithm.
Abstract - Inpainting aims to restore images with partial information loss and tries to fill in the missing parts in such a way that the reconstructed video looks natural. The key issue in video completion is to maintain the spatio-temporal information. Many researchers have worked in the area of video inpainting. Most of them try to maintain either spatial regularity or temporal stability between the frames, but none of them maintains both within the same technique with good quality. There are complicated situations in which low-quality input images suffer from inadequate resolution as well as missing regions. Treating super-resolution and inpainting simultaneously reduces noise more than applying super-resolution after inpainting.
Information in the frequency domain is useful in image classification. A global feature of a scene, named "spatial envelope", has been proposed by exploring the dominant spatial structure of a scene. For this global feature, the global energy spectrum is used to develop spectral signatures for each scene category. To capture the textural characteristics of the image in the frequency domain, a variant of the global energy feature has also been presented, which explores the statistics of the co-occurrence matrix. Although the spectral feature was specially designed for scene classification, in this paper we present a spectral representation of face images and apply this representation to the one-sample-per-person problem. One issue with the one-sample-per-person problem is that too few training samples are available. In this paper, multi-resolution spectral images are extracted and used as representations of training face images by means of a similar method, thereby greatly enlarging the training set. We find that, among these spectral feature images, features extracted from some specific orientations and scales using 2DLDA are not sensitive to variations of illumination and expression. Inspired by this finding, our algorithm uses the spectral features as a robust representation of faces. As we do not know exactly which orientations and scales are robust for all testing images, an alternative approach is to use all of these filters in the decision-making process. In our method, each of the filters forms one weak classifier. The strategy of classifier committee learning (CCL) is then designed to combine the results obtained from different spectral feature images to determine the classes of the testing images. With the strategy of CCL, on the one hand, most of the correct categorizations can be retained.
On the other hand, it is not necessary for us to choose the optimal filters, which is a very difficult task for the one-sample-per-person problem. Using the above strategies, the negative effects caused by unfavorable factors, such as variations of illumination and facial expression, can be greatly alleviated in face recognition. Experimental results on some standard databases demonstrate the feasibility and efficiency of the proposed method.