Image Resizing - Image and Video Processing

5.1 Image Halving and Image Doubling in the Compressed Domain

5.1.1 Using Linear, Distributive and Unitary Transform Properties

5.1.2 Using Convolution-Multiplication Properties

5.1.2.1 Two-fold Downsampling of 8-point DCT Blocks in 1-D

5.1.2.2 Twofold Upsampling of 8-point DCT Blocks in 1-D

5.1.2.3 Example in 2-D

5.1.3 Using Subband DCT Approximation with Block Composition and Decomposition

5.1.3.1 Image Halving

5.1.3.2 Image Doubling

5.1.4 Performance Analysis

5.2 Resizing with Integral Factors

5.2.1 L × M Downsampling Algorithm (LMDS)

5.2.2 L × M Upsampling Algorithm (LMUS)

5.3 Resizing with Arbitrary Factors

5.4 Hybrid Resizing

5.4.1 Computational Cost

5.5 Summary

Image resizing is often required to fit images into the format demanded by applications that display, transmit, or store them. Display devices are now available over a wide range of resolutions, and camera resolutions also vary widely, so images frequently have to be shown at a spatial resolution that matches the display. The same holds for videos, which also need to be resized for this purpose. In this chapter, however, we restrict ourselves to the problem of image resizing. Video resizing is more complex and is closely related to the more general problem of transcoding images and videos; it is therefore discussed in Chapter 6.

Image resizing converts an image of size $M_1 \times N_1$ into one of a different size, $M_2 \times N_2$. In particular, when $M_2 = \frac{M_1}{2}$ and $N_2 = \frac{N_1}{2}$, the operation is known as image halving. On the other hand, for $M_2 = 2M_1$ and $N_2 = 2N_1$, the process is referred to as image doubling. If $M_2$ and $N_2$ are less than $M_1$ and $N_1$, respectively, the process is called downsampling; if they are greater, it is called upsampling. There are various interpolation techniques for performing this task in the spatial domain. In this chapter, we review some of the key approaches [36, 60, 62, 94, 99, 108, 103, 106, 111, 123, 124, 132, 134, 136] to solving this problem in the block DCT space by exploiting different properties of the DCT discussed in Chapter 2. In the next section, we first consider the problem of image halving and image doubling. Subsequently, techniques to carry out more general resizing tasks are discussed.

5.1 Image Halving and Image Doubling in the Compressed Domain

In the early stage of development of various techniques in the transform domain, problems of image halving and image doubling drew considerable attention from various researchers. Later, some of these techniques were extended to the development of resizing algorithms with arbitrary factors. To understand the development of these concepts, we first review various approaches to image halving and image doubling in the following subsections.

5.1.1 Using Linear, Distributive and Unitary Transform Properties

Several approaches use the linear, distributive, and unitary transform properties of the DCT for resizing images in this domain. For example, in [123] a simple algorithm for image halving is reported that exploits those properties.

In this approach, four adjacent 8 × 8 blocks are converted into one block, which contains the averages of the 2 × 2 sub-blocks of pixels from each of them.

Let $x_{ij}$, $0 \le i, j \le 1$, denote these four adjacent blocks in the spatial domain (as shown in Figure 5.1). The downsampled block $x_d$ is generated from these blocks according to Eq. (5.1).

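$$x_d = \sum_{j=0}^{1}\sum_{i=0}^{1} p_i\, x_{ij}\, p_j^T, \tag{5.1}$$

where

$$p_0 = \begin{bmatrix} D_{4\times 8} \\ 0_{4\times 8} \end{bmatrix}, \qquad p_1 = \begin{bmatrix} 0_{4\times 8} \\ D_{4\times 8} \end{bmatrix}. \tag{5.2}$$

In the above equations, $0_{4\times 8}$ is a $4 \times 8$ null (or zero) matrix, and $D_{4\times 8}$ is defined as

$$D_{4\times 8} = \begin{bmatrix}
0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5
\end{bmatrix}. \tag{5.3}$$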

Figure 5.1: Four adjacent spatial domain blocks ($x_{00}$, $x_{01}$, $x_{10}$, $x_{11}$).
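Eq. (5.1) can be checked numerically with a few lines of NumPy. The sketch below is only an illustration: the function names and the random 16 × 16 test image are assumptions, not part of [123]. It builds $p_0$ and $p_1$ from Eq. (5.2) and confirms that the sum in Eq. (5.1) equals plain 2 × 2 averaging of the 16 × 16 region.

```python
import numpy as np

def halving_matrices():
    """Return p0 and p1 of Eq. (5.2), built from D_{4x8} of Eq. (5.3)."""
    D = np.zeros((4, 8))
    for r in range(4):
        D[r, 2 * r] = 0.5        # each row averages one pair of samples
        D[r, 2 * r + 1] = 0.5
    Z = np.zeros((4, 8))
    return np.vstack([D, Z]), np.vstack([Z, D])

def halve_blocks(x):
    """x[i][j] are the four 8x8 blocks x_ij; returns x_d of Eq. (5.1)."""
    p = halving_matrices()
    xd = np.zeros((8, 8))
    for i in range(2):
        for j in range(2):
            xd += p[i] @ x[i][j] @ p[j].T
    return xd

# Hypothetical test data: a random 16x16 region split into four 8x8 blocks.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16)).astype(float)
blocks = [[image[8 * i:8 * i + 8, 8 * j:8 * j + 8] for j in range(2)]
          for i in range(2)]

xd = halve_blocks(blocks)
# Reference: average every 2x2 neighbourhood of the 16x16 region.
reference = image.reshape(8, 2, 8, 2).mean(axis=(1, 3))
print(np.allclose(xd, reference))
```

The block $x_{ij}$ lands in the quadrant selected by $p_i$ (rows) and $p_j^T$ (columns), so the four terms fill disjoint 4 × 4 quadrants of $x_d$.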

In the transform domain, Eq. (5.1) is given by

$$\mathrm{DCT}(x_d) = \sum_{j=0}^{1}\sum_{i=0}^{1} \mathrm{DCT}(p_i)\,\mathrm{DCT}(x_{ij})\,\mathrm{DCT}(p_j^T). \tag{5.4}$$

Eq. (5.4) provides the transform-domain equivalent of the bilinear decimation technique. A typical result from the transform domain operations is shown in Figure 5.2. Even though the matrices $p_0$ and $p_1$ are sparse in the spatial domain, their DCTs (denoted by $P_0$ and $P_1$, respectively) are not so sparse, as shown below.

$$P_0 = \begin{bmatrix}
0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.453 & 0.204 & -0.034 & 0.010 & 0 & -0.006 & 0.014 & -0.041 \\
0 & 0.49 & 0 & 0 & 0 & 0 & 0 & -0.098 \\
-0.159 & 0.388 & 0.237 & -0.041 & 0 & 0.027 & -0.098 & -0.077 \\
0 & 0 & 0.462 & 0 & 0 & 0 & -0.191 & 0 \\
0.106 & -0.173 & 0.355 & 0.204 & 0 & -0.136 & -0.147 & 0.034 \\
0 & 0 & 0 & 0.416 & 0 & -0.278 & 0 & 0 \\
-0.090 & 0.136 & -0.173 & 0.360 & 0 & -0.240 & 0.072 & -0.027
\end{bmatrix} \tag{5.5}$$

$$P_1 = \begin{bmatrix}
0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-0.453 & 0.204 & 0.034 & 0.010 & 0 & -0.006 & -0.014 & -0.041 \\
0 & -0.49 & 0 & 0 & 0 & 0 & 0 & 0.098 \\
0.159 & 0.388 & -0.237 & -0.041 & 0 & 0.027 & 0.098 & -0.077 \\
0 & 0 & 0.462 & 0 & 0 & 0 & -0.191 & 0 \\
-0.106 & -0.173 & -0.355 & 0.204 & 0 & -0.136 & 0.147 & 0.034 \\
0 & 0 & 0 & -0.416 & 0 & 0.278 & 0 & 0 \\
0.090 & 0.136 & 0.173 & 0.360 & 0 & -0.240 & -0.072 & -0.027
\end{bmatrix} \tag{5.6}$$

Figure 5.2: Image halving using the linear, distributive, and unitary properties of the DCT: (a) original image, (b) bilinear decimation in the spatial domain, and (c) downsampled image in the transform domain (58.24 dB with respect to the image in (b), with a JPQM of 8.52).

Hence, spatial domain processing that employs a fast DCT (and IDCT) algorithm [157] requires less computation than the above approach.
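The transform-domain form in Eq. (5.4) and the loss of sparsity can both be observed with a short NumPy sketch. It assumes the orthonormal type-II DCT, written here as an explicit 8 × 8 matrix; the helper names are illustrative and not taken from [123].

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal type-II DCT matrix C, so DCT(a) = C @ a @ C.T for n x n a."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def dct2(a, C):
    return C @ a @ C.T

def halving_matrices():
    D = np.kron(np.eye(4), [[0.5, 0.5]])   # D_{4x8} of Eq. (5.3)
    Z = np.zeros((4, 8))
    return np.vstack([D, Z]), np.vstack([Z, D])

def halve_in_dct_domain(X, C):
    """X[i][j] are the DCT blocks of x_ij; evaluates Eq. (5.4)."""
    P = [dct2(p, C) for p in halving_matrices()]
    Xd = np.zeros((8, 8))
    for i in range(2):
        for j in range(2):
            # DCT(p_j^T) equals the transpose of DCT(p_j).
            Xd += P[i] @ X[i][j] @ P[j].T
    return Xd

C = dct_matrix()
rng = np.random.default_rng(0)
image = rng.random((16, 16))               # hypothetical test data
X = [[dct2(image[8 * i:8 * i + 8, 8 * j:8 * j + 8], C) for j in range(2)]
     for i in range(2)]

Xd = halve_in_dct_domain(X, C)
spatial = image.reshape(8, 2, 8, 2).mean(axis=(1, 3))
print(np.allclose(Xd, dct2(spatial, C)))   # same result as spatial averaging

p0, _ = halving_matrices()
P0 = dct2(p0, C)
print(np.count_nonzero(p0), np.count_nonzero(np.abs(P0) > 1e-9))
```

The identity holds because the DCT matrix is orthogonal, so the factors $C^T C$ inserted between the three matrices cancel. The last line prints the number of non-zero entries before and after the transform, which is the reason the dense products in Eq. (5.4) cost more than a fast DCT followed by spatial averaging.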

5.1.2 Using Convolution-Multiplication Properties

In this approach, as in resizing in the spatial domain, images are subjected to low-pass filtering before downsampling or after upsampling. In Chapter 3, we have already discussed how filtering is performed directly in the DCT domain by using the convolution-multiplication property. For example, in [94], during the downsampling operation, low-pass filtering is applied using the third relationship of Eq. (2.94) of Chapter 2. The relevant relationship in 1-D is restated below.

$$C_{1e}\big(x \circledast y(n-1)\big) = \sqrt{2N}\; C_{2e}(x(l))\; C_{2e}(y(m)), \quad 0 \le l, m \le N-1, \; -1 \le n < N. \tag{5.7}$$

Using the above, we compute the filtered output in the compressed domain. Next, by applying the downsampling property of DCT coefficients (refer to Theorem 2.11 in Section 2.2.4.2 of Chapter 2), the downsampled coefficients in the type-II DCT space are computed. For the doubling operation, the DCT coefficients are computed using Theorem 2.12 (refer to Section 2.2.4.2), which is followed by low-pass filtering in the DCT domain. In [132], these techniques are further refined by considering the spatial relationship of adjacent blocks. It should be mentioned here that truncation or zero-padding of DCT coefficients is itself a kind of low-pass filtering operation, which also takes care of the adjustment of coefficients through the upsampling or downsampling processes as described in Theorems 2.11 and 2.12 of Chapter 2.

In [62], instead of a convolution-multiplication property of DCTs, a multiplication-convolution property is used. In this case, downsampling and upsampling operations are shown as the sum of multiplication operations of the sample values with given windows in the spatial domain. The multiplication operations are efficiently computed in the transform domain using symmetric convolution in the DCT domain (refer to the fourth relationship of Eq. (2.94) in Chapter 2). For the sake of brevity, the concept is discussed for processing DCT blocks in 1-D in the following subsections.

5.1.2.1 Two-fold Downsampling of 8-point DCT Blocks in 1-D

Let us consider two adjacent blocks of 8 sample points in the spatial domain and denote the sequences as $x_1(n)$ and $x_2(n)$, $0 \le n \le 7$. Consider the half-symmetric extensions of these two sequences such that the point of symmetry lies at the end of $x_1(n)$ and at the beginning of $x_2(n)$.

The extended sequences are represented in the following forms:

$$\tilde{x}_1(n) = \begin{cases} x_1(n), & 0 \le n \le 7, \\ x_1(15-n), & 8 \le n \le 15, \end{cases} \tag{5.8}$$

$$\tilde{x}_2(n) = \begin{cases} x_2(7-n), & 0 \le n \le 7, \\ x_2(n-8), & 8 \le n \le 15. \end{cases} \tag{5.9}$$

This makes the length of each extended sequence 16. To form the concatenated sequence of $x_1(n)$ followed by $x_2(n)$, we multiply the extended sequences by the window functions $w_1(n)$ and $w_2(n)$, respectively, as given in the following expression:

$$x(n) = w_1(n)\,\tilde{x}_1(n) + w_2(n)\,\tilde{x}_2(n), \tag{5.10}$$

where

$$w_1(n) = \begin{cases} 1, & 0 \le n \le 7, \\ 0, & 8 \le n \le 15, \end{cases} \tag{5.11}$$

and

$$w_2(n) = \begin{cases} 0, & 0 \le n \le 7, \\ 1, & 8 \le n \le 15. \end{cases} \tag{5.12}$$
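The extension and windowing steps of Eqs. (5.8) to (5.12) are easy to trace on concrete numbers. The following NumPy sketch uses illustrative function names that are not from [62]. It builds the two 16-point half-symmetric extensions and shows that windowing and adding them yields $x_1(n)$ followed by $x_2(n)$.

```python
import numpy as np

def extend_pair(x1, x2):
    """Half-symmetric extensions of Eqs. (5.8) and (5.9)."""
    xe1 = np.concatenate([x1, x1[::-1]])   # x1 followed by its mirror image
    xe2 = np.concatenate([x2[::-1], x2])   # mirror image of x2 followed by x2
    return xe1, xe2

def windows():
    """Window functions w1 and w2 of Eqs. (5.11) and (5.12)."""
    w1 = np.concatenate([np.ones(8), np.zeros(8)])
    return w1, 1.0 - w1

def concatenate_by_windowing(x1, x2):
    """Eq. (5.10): keep the first half of xe1 and the second half of xe2."""
    xe1, xe2 = extend_pair(x1, x2)
    w1, w2 = windows()
    return w1 * xe1 + w2 * xe2

x1 = np.arange(8, dtype=float)        # 0, 1, ..., 7
x2 = np.arange(8, 16, dtype=float)    # 8, 9, ..., 15
x = concatenate_by_windowing(x1, x2)
print(np.array_equal(x, np.arange(16)))
```

The mirrored halves are discarded by the windows. They matter only because the symmetric extensions have a simple DCT, which is what the DCT-domain version of the same operation relies on.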

In the DCT domain, the equivalent operations on $X_1(k)$ and $X_2(k)$ (the DCTs of $x_1(n)$ and $x_2(n)$, respectively) are shown in the following expressions. First, the DCTs of the extended sequences are obtained as follows:

$$\tilde{X}_1(k) = \begin{cases} 2X_1\!\left(\frac{k}{2}\right) & \text{for } k \text{ even,} \\ 0 & \text{for } k \text{ odd.} \end{cases} \tag{5.13}$$

$$\tilde{X}_2(k) = \begin{cases} (-1)^{\frac{k}{2}}\, 2X_2\!\left(\frac{k}{2}\right) & \text{for } k \text{ even,} \\ 0 & \text{for } k \text{ odd.} \end{cases} \tag{5.14}$$

In the above equations, $\tilde{X}_1(k)$ and $\tilde{X}_2(k)$ are the 16-point DCTs of the extended sequences $\tilde{x}_1(n)$ and $\tilde{x}_2(n)$, respectively. Hence, the equivalent concatenation operation described in Eq. (5.10) is performed using the convolution-multiplication theorem in the [...] denotes the skew circular convolution. Finally, to downsample the 16-point $X(k)$ to 8-point coefficients, we perform the truncation operation with scaling as discussed in Chapter 2. Hence, representing the downsampled and the 16-point DCT coefficients by column vectors $X_d$ and $X$, respectively, the downsampled DCT coefficients are given by

$$X_d = \begin{bmatrix} I_8 & 0_8 \end{bmatrix} X. \tag{5.16}$$
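The structure of Eqs. (5.13) and (5.14), with vanishing odd coefficients and even coefficients copied from the 8-point DCT, can be verified with `scipy.fft.dct`. The sketch below is a numerical check under stated assumptions, not the algorithm of [62]. The constant in front of $X_1$ and $X_2$ depends on the DCT normalization: it is exactly 2 for SciPy's default unnormalized type-II DCT and $\sqrt{2}$ for the orthonormal one. The $1/\sqrt{2}$ scaling used with the truncation of Eq. (5.16) likewise assumes the orthonormal DCT.

```python
import numpy as np
from scipy.fft import dct, idct

def extended_dcts(x1, x2, norm=None):
    """16-point type-II DCTs of the half-symmetric extensions (5.8), (5.9)."""
    xe1 = np.concatenate([x1, x1[::-1]])
    xe2 = np.concatenate([x2[::-1], x2])
    return dct(xe1, type=2, norm=norm), dct(xe2, type=2, norm=norm)

def downsample_by_truncation(x1, x2):
    """Keep the 8 lowest of 16 orthonormal DCT coefficients, as in Eq. (5.16).

    The factor 1/sqrt(2) is the scaling that goes with the orthonormal DCT.
    """
    X = dct(np.concatenate([x1, x2]), type=2, norm="ortho")
    return X[:8] / np.sqrt(2.0)

rng = np.random.default_rng(0)
x1, x2 = rng.random(8), rng.random(8)   # hypothetical sample blocks

Xe1, Xe2 = extended_dcts(x1, x2)        # unnormalized DCT
X1, X2 = dct(x1, type=2), dct(x2, type=2)
sign = (-1.0) ** np.arange(8)           # (-1)^(k/2) for even k

print(np.allclose(Xe1[1::2], 0), np.allclose(Xe2[1::2], 0))
print(np.allclose(Xe1[0::2], 2 * X1), np.allclose(Xe2[0::2], 2 * sign * X2))

# The 8 retained coefficients describe a half-length version of the 16 samples.
xd = idct(downsample_by_truncation(x1, x2), type=2, norm="ortho")
print(xd.shape)
```

Because half of the coefficients of each extended sequence are zero, the skew circular convolution with the window spectra contains many redundant terms, which is the kind of saving the efficient formulation exploits.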

Ref. [62] discusses how the above computation can be performed efficiently by identifying several redundancies. From Eqs. (5.15) and (5.16), the equivalent computation is expressed as follows [62].

Xd= 1

In the above equation, due to the constraint of space, the first four columns of each row are shown in the top rows of the matrix and its bottom eight rows contain the remaining four columns corresponding to each of them. Further, S in Eq. (5.17) is defined below:

S =

5.1.2.2 Twofold Upsampling of 8-point DCT Blocks in 1-D

To upsample an 8-point DCT block $X(k)$, $0 \le k \le 7$, it is first converted into a 16-point block $\tilde{X}(k)$ by zero-padding and appropriate scaling as follows:

$$\tilde{X}(k) = \begin{cases} \sqrt{2}\,X(k), & 0 \le k \le 7, \\ 0, & 8 \le k \le 15. \end{cases} \tag{5.20}$$

Let the upsampled block in the spatial domain be $x(n)$. We obtain two 8-point blocks by using the same window functions in the following way:

$$\begin{aligned} y_1(n) &= x(n)\,w_1(n), & 0 \le n \le 7, \\ y_2(n) &= x(n+8)\,w_2(n+8), & 0 \le n \le 7. \end{aligned} \tag{5.21}$$

Equivalent operations in the DCT space to obtain the DCTs of $y_1(n)$ and $y_2(n)$ are shown in the following expressions:

$$\begin{aligned} Y_1(k) &= \frac{1}{4}\left(\tilde{X}(2k) \circledast_s W_1(2k)\right), & 0 \le k \le 7, \\ Y_2(k) &= (-1)^{k}\,\frac{1}{4}\left(\tilde{X}(2k) \circledast_s W_2(2k)\right), & 0 \le k \le 7. \end{aligned} \tag{5.22}$$

The above equations use the relationship between the $n$-point DCT coefficients and their upsampled versions ($2n$-point DCT) when the sequence is extended in the spatial domain with trailing or leading zeroes. The above computation is expressed in matrix notation as follows [62]:

$$Y_1 = \frac{1}{2\sqrt{2}}\,(W_1^u X),$$

$W_1^u$ also follows the same notation as $W_1^d$ regarding the representation of rows.
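The zero-padding step of Eq. (5.20) and the block split of Eq. (5.21) can be illustrated as follows. This is a minimal sketch that assumes the orthonormal type-II DCT of `scipy.fft` and, for clarity, passes through the spatial domain. It is therefore not the DCT-domain computation of Eq. (5.22), which avoids the inverse transform. The function names are illustrative.

```python
import numpy as np
from scipy.fft import dct, idct

def zero_pad_and_scale(X):
    """Eq. (5.20): scale the 8 coefficients by sqrt(2) and append 8 zeros."""
    return np.concatenate([np.sqrt(2.0) * X, np.zeros_like(X)])

def upsample_block(X):
    """Return the DCTs Y1, Y2 of the two 8-point halves of the upsampled block."""
    x = idct(zero_pad_and_scale(X), type=2, norm="ortho")   # 16 samples
    y1, y2 = x[:8], x[8:]          # windowing by w1 and w2 of Eq. (5.21)
    return dct(y1, type=2, norm="ortho"), dct(y2, type=2, norm="ortho")

rng = np.random.default_rng(0)
X = dct(rng.random(8), type=2, norm="ortho")   # hypothetical 8-point DCT block
Y1, Y2 = upsample_block(X)

# Truncating the 16-point DCT of the upsampled samples (with 1/sqrt(2) scaling)
# recovers the original block, so upsampling is undone by downsampling.
y = np.concatenate([idct(Y1, type=2, norm="ortho"),
                    idct(Y2, type=2, norm="ortho")])
print(np.allclose(dct(y, type=2, norm="ortho")[:8] / np.sqrt(2.0), X))
```

With this normalization a constant block stays constant after upsampling, which is a convenient sanity check on the $\sqrt{2}$ factor.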

5.1.2.3 Example in 2-D

The above concept is extended to 2-D by exploiting the separability property of the DCT. Let $X_{ij}$, $1 \le i, j \le 2$, be four adjacent $8 \times 8$ DCT blocks. Eq. (5.17) is extended to 2-D in the following form:

$$X_d = \frac{1}{32}\left(P_d X_{11} P_d^T + P_d X_{12} Q_d^T + Q_d X_{21} P_d^T + Q_d X_{22} Q_d^T\right). \tag{5.25}$$
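The four-term structure of Eq. (5.25) follows from applying one 1-D operator along the rows and the columns. The sketch below illustrates only that structure. Its matrices `P` and `Q` are hypothetical stand-ins built from the orthonormal DCT with truncation and $1/\sqrt{2}$ scaling, they absorb all scale factors, and they are not claimed to equal the $P_d$ and $Q_d$ of [62], whose normalization produces the factor $\frac{1}{32}$.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal type-II DCT matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def downsampling_operators():
    """Split a 1-D DCT-domain halving operator T = [P Q] (8 x 16)."""
    C8, C16 = dct_matrix(8), dct_matrix(16)
    B = np.kron(np.eye(2), C8.T)            # inverse DCT of two stacked blocks
    T = (C16 @ B)[:8, :] / np.sqrt(2.0)     # 16-point DCT, truncate, scale
    return T[:, :8], T[:, 8:]

def halve_2d(X11, X12, X21, X22):
    """Separable 2-D halving with the same four-term form as Eq. (5.25)."""
    P, Q = downsampling_operators()
    return (P @ X11 @ P.T + P @ X12 @ Q.T
            + Q @ X21 @ P.T + Q @ X22 @ Q.T)

C8, C16 = dct_matrix(8), dct_matrix(16)
rng = np.random.default_rng(0)
image = rng.random((16, 16))                # hypothetical test data
X = [[C8 @ image[8 * i:8 * i + 8, 8 * j:8 * j + 8] @ C8.T for j in range(2)]
     for i in range(2)]

Xd = halve_2d(X[0][0], X[0][1], X[1][0], X[1][1])
# Reference: truncate the 16x16 DCT of the whole region and scale by 1/2.
reference = 0.5 * (C16 @ image @ C16.T)[:8, :8]
print(np.allclose(Xd, reference))
```

Writing the stacked blocks as a 2 × 2 block matrix and multiplying by $[P \; Q]$ on the left and its transpose on the right expands into exactly these four products, which is why the 2-D case needs no operators beyond the 1-D ones.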
