Image Filtering
3.5 Filtering 2-D Images
The methods discussed above for 1-D are easily extended to 2-D if the impulse response is separable. However, for nonseparable cases, we need to employ different strategies. We discuss below the filtering of images with both separable and nonseparable filters in the DCT domain.
3.5.1 Separable Filters
For separable filters, a finite impulse response $h(m, n)$ can be written in the form $h(m, n) = h_1(m)\,h_2(n)$, $-L \le m, n \le L$. Let us represent the $(i, j)$th input block as $x_{i,j}$ in 2-D, and its corresponding type-II DCT as $X_{i,j}$. The input-output theorems in 1-D can easily be extended to 2-D. We present below the extension of Theorem 3.9, as it is stated in a more generic framework.
Theorem 3.11 Let $h(m, n) = h_1(m)\,h_2(n)$, $-L \le m, n \le L$, be the finite impulse response of a separable filter. Let us express each kernel function ($h_1(n)$ or $h_2(n)$) of the separable function as a sum of a pair of symmetric and antisymmetric functions as described below:
\[
\begin{aligned}
h_{1s}(n) &= \tfrac{1}{2}\bigl(h_1(n) + h_1(-n)\bigr), & h_{1a}(n) &= \tfrac{1}{2}\bigl(h_1(n) - h_1(-n)\bigr),\\
h_{2s}(n) &= \tfrac{1}{2}\bigl(h_2(n) + h_2(-n)\bigr), & h_{2a}(n) &= \tfrac{1}{2}\bigl(h_2(n) - h_2(-n)\bigr).
\end{aligned}
\tag{3.44}
\]
Let $E_{1s}$, $F_{1s}$, and $G_{1s}$ be the filter matrices corresponding to the response $h_{1s}(n)$, formed according to Theorem 3.5. Similarly, $E_{2s}$, $F_{2s}$, and $G_{2s}$ are formed from $h_{2s}(n)$. For the antisymmetric component $h_{1a}(n)$ ($h_{2a}(n)$), these are $E_{1a}$ ($E_{2a}$), $F_{1a}$ ($F_{2a}$), and $G_{1a}$ ($G_{2a}$), respectively (from Theorem 3.7).
Table 3.5: Per-pixel computational cost of filtering in 2-D

Impulse response type    $n_m$    $n_a$
Symmetric                3N       3N + 2
Antisymmetric            3N       3N + 2
Causal                   4N       4N − 2
Noncausal arbitrary      6N       6N − 2
Then, the DCT of the filtered output of the $(i, j)$th block, $Y_{i,j}$, is given by the following expression:

Due to the separability property of the impulse response, filtered outputs are computed in two stages. First, the computation is carried out along the vertical direction, and next, it is applied along the horizontal direction. Hence, for separable responses, the computation in Theorem 3.11 is expressed in the following equivalent form:
For each stage of computation, we adopt computation strategies similar to those of 1-D, as charted in Table 3.3. The resulting per-pixel computational costs for different types of filter responses are given in Table 3.5. Similar strategies are also adopted for other methods [79, 165] in 1-D.
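The decomposition in (3.44) can be sketched numerically as follows. This is only a minimal illustration: the helper name `split_symmetric_antisymmetric` is hypothetical, and the kernel is assumed to be stored as an odd-length array holding h(n) for n = −L, ..., L. The sample values are our reading of the noncausal kernel of Table 3.11.

```python
import numpy as np

def split_symmetric_antisymmetric(h):
    """Split h(n), n = -L..L (centre at index L), as in Eq. (3.44).

    Hypothetical helper: returns (hs, ha) with hs(n) = hs(-n),
    ha(n) = -ha(-n), and hs + ha = h.
    """
    h = np.asarray(h, dtype=float)
    if h.size % 2 == 0:
        raise ValueError("kernel must have odd length 2L + 1")
    h_reflected = h[::-1]            # samples of h(-n)
    hs = 0.5 * (h + h_reflected)     # symmetric part
    ha = 0.5 * (h - h_reflected)     # antisymmetric part
    return hs, ha

# Noncausal kernel, n = -3..3 (values assumed from Table 3.11)
h1 = np.array([1/12, -1/6, 1/3, 1/2, 1/4, -1/12, 1/12])
h1s, h1a = split_symmetric_antisymmetric(h1)
```

The two parts can then be handled by the symmetric and antisymmetric filtering results separately, and the outputs added, since filtering is linear in the impulse response.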
3.5.1.1 Sparse Computation
In the block DCT space, we may opt for computing with a sparse DCT block. As higher-frequency components are, in most cases, of smaller magnitude, they may be left out of the computation, so that only the first $N/2 \times N/2$ low-frequency coefficients are used. This may be applied to a few selected blocks or, at times, to all the input blocks. This type of computation is termed here sparse computation. In the overlap-and-add method discussed here, we consider two variations of this sparse computation. First, all the input DCT blocks are taken as sparse. This is referred to as all sparse computation (ASC). In the second variation, only the neighboring blocks are considered as sparse, and all the DCT coefficients of the central block are used in the computation. This variation is called sparse neighbor computation (SNC). We also refer to the nonsparse computation, that is, filtering with all the coefficients of every DCT block, as full block computation (FBC). Per-pixel computational costs of the ASC and the SNC are given in Tables 3.6 and 3.7, respectively.

Table 3.6: Per-pixel computational cost of filtering in 2-D using the ASC

Impulse response type    $n_m$    $n_a$
Symmetric                3N/8     (3N + 20)/16
Antisymmetric            3N/8     (3N + 20)/16
Causal                   N/2      (N − 1)/2
Noncausal arbitrary      3N/4     (3N − 2)/4

Table 3.7: Per-pixel computational cost of filtering in 2-D using the SNC

Impulse response type    $n_m$    $n_a$
Symmetric                5N/4     5N/4 − 1
Antisymmetric            5N/4     5N/4 − 1
Causal                   9N/4     (9N + 6)/4
Noncausal arbitrary      5N/2     (5N + 2)/2
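As a rough illustration of the sparsification step alone (not of the filtering algorithm itself), the sketch below keeps only the first N/2 × N/2 low-frequency coefficients of a DCT block. The orthonormal DCT scaling, the helper names, and the smooth test block are assumptions made for this example.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal type-II DCT matrix (scaling assumed for this sketch)."""
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    C = np.sqrt(2.0 / N) * np.cos(np.pi * k * (2 * n + 1) / (2 * N))
    C[0, :] /= np.sqrt(2.0)
    return C

def sparsify_block(X):
    """Keep the first N/2 x N/2 low-frequency coefficients, zero the rest."""
    N = X.shape[0]
    Xs = np.zeros_like(X)
    Xs[:N // 2, :N // 2] = X[:N // 2, :N // 2]
    return Xs

N = 8
C = dct_matrix(N)
i, j = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
x = 100.0 + 5.0 * i + 3.0 * j           # hypothetical smooth block
X = C @ x @ C.T                         # type-II DCT of the block
Xs = sparsify_block(X)                  # sparse version of the block
x_approx = C.T @ Xs @ C                 # spatial block implied by Xs
kept_energy = np.sum(Xs ** 2) / np.sum(X ** 2)
```

In terms of this sketch, the ASC would apply `sparsify_block` to every input block, whereas the SNC would apply it only to the neighbors of the block being filtered.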
3.5.1.2 Computation through Spatial Domain
As explained in Chapter 1, the cost of computation through the spatial domain should include the costs of the inverse and forward DCT. In this regard, the computational costs of these transforms are taken from the efficient technique reported in [82]. According to this method, the required numbers of multiplications ($n_m$) and additions ($n_a$) for a 2-D $N \times N$ DCT are given below.
\[
\begin{aligned}
n_m &= \frac{N^2}{2}\log_2 N,\\
n_a &= \frac{3N^2}{2}\log_2 N - N^2 + N.
\end{aligned}
\tag{3.47}
\]
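To see how (3.47) enters a per-pixel cost, the following sketch evaluates the transform counts and spreads the cost of one forward and one inverse transform over the N × N pixels of a block. The function names are hypothetical. Dividing twice the counts of (3.47) by N² gives log2 N multiplications and 3 log2 N − 2 + 2/N additions per pixel.

```python
import math

def dct_2d_cost(N):
    """Multiplications and additions of one 2-D N x N DCT, Eq. (3.47)."""
    log2N = math.log2(N)
    n_m = (N ** 2 / 2) * log2N
    n_a = (3 * N ** 2 / 2) * log2N - N ** 2 + N
    return n_m, n_a

def transform_overhead_per_pixel(N):
    """Per-pixel cost of one forward plus one inverse transform."""
    n_m, n_a = dct_2d_cost(N)
    return 2 * n_m / N ** 2, 2 * n_a / N ** 2

overhead_m, overhead_a = transform_overhead_per_pixel(8)
```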
Table 3.8: Per-pixel computational cost of filtering through convolution in the spatial domain

Impulse response type    $n_m$                $n_a$
Symmetric                2N + 2 + log2 N      2N + 3 log2 N − 2 + 2/N
Antisymmetric            2N + log2 N          2N + 3 log2 N − 4 + 2/N
Causal                   2N + 2 + log2 N      2N + 3 log2 N − 2 + 2/N
Noncausal arbitrary      4N + 2 + log2 N      4N + 3 log2 N − 2 + 2/N
Let a separable filter of nonuniform kernel $h(m, n) = h_1(m)\,h_2(n)$, $-N \le m, n \le N$, be applied to an image. In the spatial domain, the convolution operation is efficiently implemented by a sequence of three operations, namely, shift, multiplication, and addition. These are carried out for each nonzero $h_1(m)$ and $h_2(n)$ in two stages. In the first stage, the shifts are along the vertical direction, and in the second stage, they are applied along the horizontal direction on the data obtained from the first stage. Following this implementation, the per-pixel computational complexities of these filters are as shown in Table 3.8.
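The two-stage scheme can be sketched as below and checked against a direct shift, multiply, and add evaluation of the 2-D convolution. The function names, the zero padding at the image border, and the random test image are assumptions of this example. Both kernels are assumed to be centered and of odd length, and the kernel values are our reading of Tables 3.9 and 3.11.

```python
import numpy as np

def separable_filter(image, h1, h2):
    """Stage 1 filters columns with h1 (vertical), stage 2 rows with h2."""
    stage1 = np.apply_along_axis(
        lambda col: np.convolve(col, h1, mode="same"), 0, image)
    return np.apply_along_axis(
        lambda row: np.convolve(row, h2, mode="same"), 1, stage1)

def direct_2d_filter(image, h1, h2):
    """Direct 2-D convolution with h(m, n) = h1(m) h2(n), zero padded."""
    L1, L2 = len(h1) // 2, len(h2) // 2
    rows, cols = image.shape
    padded = np.pad(image, ((L1, L1), (L2, L2)))
    out = np.zeros((rows, cols))
    for a, m in enumerate(range(-L1, L1 + 1)):
        for b, n in enumerate(range(-L2, L2 + 1)):
            # shifted copy x(i - m, j - n), scaled by h1(m) h2(n)
            shifted = padded[L1 - m:L1 - m + rows, L2 - n:L2 - n + cols]
            out += h1[a] * h2[b] * shifted
    return out

rng = np.random.default_rng(0)
image = rng.random((16, 16))                               # hypothetical data
h1 = np.array([1/12, -1/6, 1/3, 1/2, 1/3, -1/6, 1/12])     # symmetric
h2 = np.array([1/12, -1/6, 1/3, 1/2, 1/4, -1/12, 1/12])    # arbitrary
y_two_stage = separable_filter(image, h1, h2)
y_direct = direct_2d_filter(image, h1, h2)
```

For kernels of length 2N + 1, the two-stage form needs on the order of 2(2N + 1) multiplications per pixel, against (2N + 1)² for the direct form, which is why the separable case is treated on its own.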
From Tables 3.5 and 3.8, it is observed that, for small block sizes, symmetric or antisymmetric filtering in the block DCT domain performs better than the spatial domain technique, whereas the latter is superior to DCT filtering for more general types of impulse responses. The ASC or the SNC technique (see Tables 3.6 and 3.7), however, offers significant savings in computation. It is of interest to know how the quality of the results suffers due to them. This is discussed in the next subsection with typical examples of filtering.
3.5.1.3 Quality of Filtered Images with Sparse Computation
The quality of filtered output is judged with reference to the output obtained by the linear convolution operation in the spatial domain. In this regard, the peak signal-to-noise ratio (PSNR) is an appropriate measure. Moreover, to observe the blocking and blurring artifacts due to the introduction of discontinuities at boundaries of 8 × 8 blocks, the JPEG quality metric (JPQM) is used. As discussed previously (see Section 1.7.2.2), the higher the value of JPQM, the better the quality of the image, and for an image with good visual quality, the JPQM value should be close to 10 or higher.
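For reference, a PSNR computation of the kind used here may be sketched as follows. The peak value of 255 (8-bit images) and the function name are assumptions of this example. The reference would be the output of linear convolution in the spatial domain.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """PSNR in dB of `test` with respect to `reference` (peak assumed 255)."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Two hypothetical images differing by one gray level everywhere
a = np.full((4, 4), 100.0)
b = a + 1.0
value = psnr(a, b)
```

With this definition, a PSNR above 300 dB corresponds to a mean squared error below about 6.5 × 10⁻²⁶, which is at the level of floating-point rounding.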
The results presented here are obtained by applying separable filters of uniform kernels in the form of $g(m, n) = h(m)\,h(n)$. Typically, we have chosen a set of kernels as given in Tables 3.9, 3.10, 3.11, and 3.12 for different types of filter response. For each case, the FBC, the ASC, and the SNC algorithms are used. The PSNR and JPQM values obtained on application of these filters over a typical image Lena (of size 256 × 256) are shown in Table 3.13.
Table 3.9: A noncausal symmetric FIR

n       −3      −2      −1      0      1      2       3
h(n)    1/12    −1/6    1/3     1/2    1/3    −1/6    1/12

Table 3.10: A noncausal antisymmetric FIR

n       −3       −2     −1      0    1      2       3
h(n)    −1/12    1/6    −1/3    0    1/3    −1/6    1/12

From Table 3.13, it is observed that the PSNR values obtained from the FBC are very high (> 300 dB), implying that the FBC in the block DCT space is equivalent to linear convolution in the spatial domain. We also observe that all the JPQM values obtained from these techniques are significantly high. However, these values vary depending on the filter response and the input data. We also compare the PSNR values obtained from the ASC and the SNC. As the PSNR values obtained from the ASC are significantly lower than those obtained from the SNC, the latter technique is better suited for different applications. Though its computational cost is higher than that of the ASC, it is substantially lower than that of the FBC, and the technique is computationally more efficient than filtering through spatial domain operations (see Tables 3.8 and 3.7).
3.5.2 Nonseparable Filters
Due to the nonseparability of the response, it is not possible to translate the convolution–multiplication property in 2-D into a linear form. Rather, in this case, we need to perform pointwise multiplication between the coefficients in the transform space. We restrict our discussion to filtering with symmetric nonseparable finite impulse responses. Let $h(m, n)$, $-L \le m, n \le L$, denote the impulse response. As it is symmetric, $h(m, n) = h(-m, -n)$. Hence, the specification in any quadrant of the 2-D space is sufficient to describe the complete response. Let us denote the response in the first quadrant as $h^{+}(m, n) = h(m, n)$, $0 \le m, n \le L$. For block filtering, we follow the same strategy of merging the neighboring blocks of an input block. Then, the convolution–multiplication property of 2-D transforms is applied by pointwise multiplication of the type-I DCT coefficients of the response and the type-II DCT coefficients of the merged DCT block. This property is an extended form of what is shown in the first row of Table 3.1. Finally, block decomposition is performed on the larger filtered block, and the central target block is saved in the resulting output. The above computational steps are summarized in the theorem given below.
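The pointwise-multiplication property used here can be checked in 1-D with the sketch below. It is only an illustration under stated assumptions: the transforms are written without normalization (so the scale factors differ from those of Table 3.1), the input is extended with half-sample symmetry, the filter is symmetric with support L < N, and all function names are hypothetical.

```python
import numpy as np

def dct2e(x):
    """Unnormalized type-II DCT: 2 * sum_n x(n) cos(pi k (2n+1) / (2N))."""
    N = len(x)
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    return 2.0 * np.cos(np.pi * k * (2 * n + 1) / (2 * N)) @ x

def dct1e_filter(h_plus, N):
    """Unnormalized type-I DCT of h(n), n = 0..N, evaluated for k = 0..N-1."""
    h = np.zeros(N + 1)
    h[:len(h_plus)] = h_plus             # zero pad the one-sided response
    w = np.full(N + 1, 2.0)
    w[0] = w[N] = 1.0                    # end points are counted once
    k = np.arange(N)[:, None]
    n = np.arange(N + 1)[None, :]
    return np.cos(np.pi * k * n / N) @ (w * h)

def symmetric_convolution(x, h_plus):
    """Filter the symmetrically extended x with h(m) = h_plus[|m|]."""
    N, L = len(x), len(h_plus) - 1
    x_ext = np.concatenate([x, x[::-1]])         # half-sample symmetry
    y = np.zeros(N)
    for n in range(N):
        for m in range(-L, L + 1):
            y[n] += h_plus[abs(m)] * x_ext[(n - m) % (2 * N)]
    return y

N = 8
rng = np.random.default_rng(1)
x = rng.random(N)                                # hypothetical input block
h_plus = np.array([1/2, 1/3, -1/6, 1/12])        # h(0..3), as read from Table 3.9
lhs = dct2e(symmetric_convolution(x, h_plus))
rhs = dct1e_filter(h_plus, N) * dct2e(x)         # pointwise product
```

In the block filtering scheme, the symmetric extension affects only samples near the ends of the merged block, which is why the neighboring blocks are merged first and only the central block is retained.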
Table 3.11: A noncausal FIR

n       −3      −2      −1     0      1      2        3
h(n)    1/12    −1/6    1/3    1/2    1/4    −1/12    1/12

Table 3.12: A causal FIR

n       0      1      2      3
h(n)    1/2    1/4    1/8    1/8
Theorem 3.12 Let $h(m, n)$, $-L \le m, n \le L$, be the symmetric finite impulse response, and let its response in the first quadrant be denoted as $h^{+}(m, n) = h(m, n)$, $0 \le m, n \le L$. Let $X_{i,j}$ be the $(i, j)$th $N \times N$ input DCT block. Then, the DCT of the filtered output of the $(i, j)$th block, $Y_{i,j}$, is given by the following expression:
\[
Y_{i,j} = P\,A_{(3,N)}^{T}\left(\left(A_{(3,N)}
\begin{bmatrix}
X_{i-1,j-1} & X_{i-1,j} & X_{i-1,j+1}\\
X_{i,j-1} & X_{i,j} & X_{i,j+1}\\
X_{i+1,j-1} & X_{i+1,j} & X_{i+1,j+1}
\end{bmatrix}
A_{(3,N)}^{T}\right)\otimes H\right)A_{(3,N)}\,P^{T},
\tag{3.48}
\]
where $A_{(3,N)}$ is the block composition matrix as discussed earlier. $H$ is the type-I DCT of the response in the first quadrant. In our notation, a matrix formed by the elements $h(m, n)$, $a \le m \le b$, $c \le n \le d$, is represented as $\{h(m, n)\}_{a \le m \le b,\, c \le n \le d}$. Thus $H$ is given by
\[
H = 6N\,\Bigl\{ C_{1e}\bigl\{h^{+}(m, n)\bigr\}_{0 \le m, n \le 3N}(k, l) \Bigr\}_{0 \le k, l \le 3N-1}.
\]
Table 3.13: PSNR (in dB) and JPQM values of filtered output using proposed techniques

Type                   h(n) (from    FBC               ASC               SNC
                       Tables)       PSNR     JPQM     PSNR     JPQM     PSNR     JPQM
Symmetric              3.9           302.21   10.89    36.58    11.41    44.25    12.01
Antisymmetric          3.10          329.05   14.62    45.91    19.19    50.44    16.99
Causal                 3.12          304.43   10.82    41.60    9.43     53.89    10.67
Noncausal arbitrary    3.11          304.26   10.93    37.66    11.08    45.92    11.82
$P$ is the selection matrix as given below.
\[
P = \begin{bmatrix} 0_N & I_N & 0_N \end{bmatrix}.
\]
$0_N$ and $I_N$ are the $N \times N$ zero and identity matrices, respectively. The operator $\otimes$ denotes elementwise multiplication.
The above technique [104] follows the overlap and save approach.
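The structure of (3.48), that is, composition of the 3 × 3 neighborhood, elementwise multiplication, decomposition, and selection of the central block, can be sketched as follows. Several points are assumptions of this example rather than statements of the text: an orthonormal DCT is used, the block composition matrix is taken as the 3N-point DCT matrix times a block-diagonal matrix of N-point inverse DCTs, and H is replaced by an all-ones matrix so that the output must equal the central input block. A real filter would use the type-I DCT of its response instead.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal type-II DCT matrix (scaling assumed for this sketch)."""
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    C = np.sqrt(2.0 / N) * np.cos(np.pi * k * (2 * n + 1) / (2 * N))
    C[0, :] /= np.sqrt(2.0)
    return C

def composition_matrix(M, N):
    """Assumed form of A_(M,N): maps M N-point DCT blocks to one MN-point DCT."""
    return dct_matrix(M * N) @ np.kron(np.eye(M), dct_matrix(N).T)

def selection_matrix(N):
    """P = [0_N  I_N  0_N], which picks the central N x N block."""
    return np.hstack([np.zeros((N, N)), np.eye(N), np.zeros((N, N))])

N = 4
rng = np.random.default_rng(0)
x = rng.random((3 * N, 3 * N))                  # hypothetical 3 x 3 neighborhood
B = np.kron(np.eye(3), dct_matrix(N))           # block-wise N-point DCT
X_blocks = B @ x @ B.T                          # nine N x N DCT blocks
A = composition_matrix(3, N)
P = selection_matrix(N)

X_merged = A @ X_blocks @ A.T                   # DCT of the merged block
H_placeholder = np.ones((3 * N, 3 * N))         # stands in for H
Y = P @ A.T @ (X_merged * H_placeholder) @ A @ P.T
```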