Image Interpolation and Image Super-Resolution

Chapter 1 Introduction

1.1. Image Interpolation and Image Super-Resolution

Image interpolation and image super-resolution (SR) algorithms [1-73] aim to enlarge a small digital image into a larger one. We refer to both collectively as image up-sampling. The objective of image up-sampling is to reconstruct a high resolution (HR) image from one or several low resolution (LR) images, while minimizing visual artifacts or minimizing the difference from the original HR image. A pixel in the input LR image is related to multiple pixels in the resultant HR image, depending on the upscaling factor p. Thus, image up-sampling is an ill-posed, underdetermined inverse problem: there is an infinite number of HR images consistent with an input LR image. With only 1/p² of the original information (p is the magnification factor), it is difficult to recover the original HR image exactly. The main reason is that the pixel predictions made by image up-sampling algorithms are usually mean predictions, which may have high variance caused by ambiguity within the LR images.

1.1.1. APPLICATIONS

Despite the difficulty of the image up-sampling problem, it has practical significance: it overcomes the inherently low resolution of imaging devices and makes better use of the growing capability of HR displays. Applications of image up-sampling algorithms include face recognition, surveillance systems, medical imaging, High Definition Television (HDTV), image coding, image resizing/manipulation, and image compression. For example, in a surveillance system, the captured human faces are usually very small (for instance, 20 pixels × 10 pixels). It is difficult for human beings or face recognition algorithms to recognize such faces. Image up-sampling algorithms can enlarge the faces of interest to a larger scale, which makes them easier for humans to identify and improves the recognition rate of face recognition algorithms.

1.1.2. DEGRADATION MODEL

In both the image interpolation and image SR problems, the vectorized observed LR image patch y can be represented as:

y = DHx + n,    (1-1)

where D is the down-sampling operator, H is the blurring operator, x is the vectorized HR image patch, and n is the additive noise.

1.1.2.1. IMAGE INTERPOLATION DEGRADATION MODEL

For the image interpolation problem, there is no anti-aliasing pre-filtering during degradation, and the pixels of the LR image are directly down-sampled from the HR image, as shown in Fig. 1-1. Thus, the blurring operator H is an identity matrix, and the down-sampling matrix D is a sparse matrix which has very few elements with value 1 and most elements with value 0. Under the noise-free assumption, the degradation model reduces to y = Dx, where D is the down-sampling operator and x is the vectorized HR image patch.

Fig. 1-1: Illustration for image interpolation down-sampling: the black pixels are the ground-truth pixels in the LR image obtained from the original HR image, while the white pixels are the unknown HR pixels which are to be interpolated.

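For example, when the down-sampling factor is 2 and the HR image patch size is 4 by 4, the down-sampling operator D is a 4 × 16 matrix with a single 1 in each row, at the position of the retained HR pixel, and 0 elsewhere.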
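A minimal sketch of how such an operator can be constructed is given here. It assumes row-major vectorization of the patch and that the top-left pixel of each 2 × 2 block is retained; both choices, like the helper name downsampling_matrix, are assumptions of this sketch rather than conventions fixed by the text.

```python
import numpy as np

def downsampling_matrix(hr_size, factor):
    """Build D for direct down-sampling of a square hr_size x hr_size patch.

    Assumes row-major vectorization and that the top-left pixel of each
    factor x factor block is retained (assumptions of this sketch).
    """
    lr_size = hr_size // factor
    D = np.zeros((lr_size * lr_size, hr_size * hr_size), dtype=int)
    for i in range(lr_size):
        for j in range(lr_size):
            row = i * lr_size + j                        # index of the LR pixel
            col = (i * factor) * hr_size + (j * factor)  # index of the retained HR pixel
            D[row, col] = 1
    return D

D = downsampling_matrix(hr_size=4, factor=2)
x = np.arange(16)   # vectorized 4x4 HR patch with pixel values 0..15
y = D @ x           # vectorized 2x2 LR patch: HR positions (0,0), (0,2), (2,0), (2,2)
print(D)
print(y)
```

Each row of D contains exactly one 1, so y simply copies the retained ground-truth pixels out of x; the remaining 12 HR pixels are the unknowns to be interpolated.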

For the image SR problem, the blurring operator H is not a Dirac delta function (identity matrix) as in the image interpolation problem. Most image SR algorithms [1-16, 18-36, 38] assume that the blurring operator H is a Gaussian blur kernel or a bi-cubic filter kernel. There are also works on blind image super-resolution [74-77], which treat the blurring operator as unknown and estimate it from cues within the input image itself.

Apart from blind image SR algorithms, there are three typical single image SR degradation scenarios, all corresponding to a zooming deblurring setup with a known blur kernel and zero noise:

(1) A bi-cubic filter followed by down-sampling by a scale of 2.

(2) A bi-cubic filter followed by down-sampling by a scale of 3.

(3) A Gaussian filter of size 7×7 with standard deviation of 1.6 followed by down-sampling by a scale of 3.
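Scenario (3) can be simulated with a short sketch. The boundary handling (edge replication), the sampling phase (keeping every third pixel starting at index 0), and the function names are assumptions of this sketch; the text fixes only the kernel size, the standard deviation, and the scale.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.6):
    """Normalized 2-D Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def blur(image, kernel):
    """Filter a 2-D image with a symmetric kernel (edge replication assumed)."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += kernel[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out

def degrade(hr, kernel, scale):
    """Apply H (blur), then D (keep every scale-th pixel); the noise n is zero."""
    return blur(hr, kernel)[::scale, ::scale]

rng = np.random.default_rng(0)
hr = rng.random((12, 12))                  # stand-in HR image
lr = degrade(hr, gaussian_kernel(7, 1.6), scale=3)
print(lr.shape)                            # 12 / 3 = 4 pixels per side
```

Because the kernel is normalized, flat regions keep their intensity after degradation; only edges and textures lose high-frequency detail.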

1.1.2.2. TRANSFORM DOMAIN IMAGE UP-SAMPLING DEGRADATION MODEL

In image/video coding, images are divided into non-overlapping blocks which are converted into a transform domain (usually the Discrete Cosine Transform (DCT) domain). In a DCT block of size k × k, there are one DC coefficient and k²-1 AC coefficients. An HR DCT block is down-sampled to an LR DCT block by retaining only the low frequency parts and removing the high frequency parts. The obtained LR DCT block is then rescaled and converted back to the spatial domain by the inverse Discrete Cosine Transform (IDCT).
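The DCT-domain down-sampling described above can be sketched as follows. The orthonormal DCT-II normalization and the resulting rescaling factor m/k (chosen so that the mean intensity of the block is preserved) are assumptions of this sketch, as are the function names; the text does not specify the normalization.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C of size n x n, so that C @ C.T = I."""
    u = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * u / (2.0 * n))
    C[0, :] = np.sqrt(1.0 / n)        # DC basis vector
    return C

def dct_downsample(block, m):
    """Down-sample a k x k spatial block to m x m in the DCT domain."""
    k = block.shape[0]
    Ck, Cm = dct_matrix(k), dct_matrix(m)
    coeffs = Ck @ block @ Ck.T        # forward 2-D DCT of the HR block
    low = coeffs[:m, :m] * (m / k)    # keep low frequencies, then rescale
    return Cm.T @ low @ Cm            # m x m inverse DCT back to the spatial domain

hr_block = np.full((8, 8), 5.0)       # flat 8x8 block: only the DC coefficient is non-zero
lr_block = dct_downsample(hr_block, 4)
print(lr_block)
```

For the flat block, the down-sampled 4 × 4 block keeps the same intensity, which is a quick check that the rescaling is consistent with the chosen normalization.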

1.1.3. DIFFERENCE BETWEEN IMAGE INTERPOLATION AND IMAGE SUPER-RESOLUTION

As the image interpolation problem and the image SR problem have different assumptions (degradation models), their algorithms are different in principle.

In the image interpolation problem, the pixels of the obtained LR image are directly down-sampled from the original HR image. The observed LR image is very sharp, but it is also highly aliased, since there is no anti-aliasing pre-filtering (blur). Thus, we estimate the missing HR pixels based on some ground-truth pixels. The main task of image interpolation is to insert pixels on the HR image grid between the known pixels and to resolve the strong aliasing, if any. The missing HR pixels can be roughly predicted from the statistics of the observed LR pixels.

In the image SR problem, by contrast, we do not have ground-truth pixels, because the blurring operator is not an identity matrix. The initial bi-cubic interpolated image is much blurrier than the original HR image. For this reason, the image SR problem can be referred to as zooming deblurring. “Zooming” means that the LR image should be enlarged, and “deblurring” means that the image SR algorithm should be able to restore the sharpness of the super-resolved image.