5.3 A Novel Block-Oriented Decomposition Approach
5.3.1 Multiscale Decomposition Algorithm
The proposed algorithm is based on the multiscale algorithm of Chan et al., in which the colour coherence vector is used to extract features from each sub-image. Because colour properties do not change when an image is re-scaled without maintaining its aspect ratio, Chan et al. resize all database images to dyadic sizes to simplify image cropping and re-scaling. This means that at the lowest level, the database image is always of size 64×64, the same size as the sub-image patch. If the multiscale approach is to be used for texture retrieval, a modification is necessary, since re-scaling an image without maintaining its aspect ratio tends to alter the properties of the underlying texture completely. We would like to ensure that all texture properties remain unaltered at every level, except for scale, so that the retrieved images bear a fair resemblance to the query image.
The proposed algorithm is described below. The sub-image patch is the same size as that proposed by Chan et al., namely 64×64, since we believe that in real applications the query image should not be smaller than this. Consider a texture of size 256×256. Because the size is dyadic, 16 sub-images fit exactly into the whole image at the root level, as shown in Figure 5.6(a). However, these sub-images do not overlap, and one might argue that better localisation can be achieved by using overlapping sub-images. Figure 5.6(b) shows the 33 additional sub-images placed inside the whole image, making a total of 49 sub-images for the overlapping case. While the overlapping approach gives better localisation, it requires more sub-images, which means more computation at each scale. Throughout this chapter, both the overlapping and non-overlapping approaches are investigated for their performance.
The number of sub-images, K, generated for the non-overlapped and overlapped cases at any single level can be computed respectively as:

$$K = \frac{\text{Width pixels}}{64} \times \frac{\text{Height pixels}}{64} \tag{5.1}$$

$$K = \left(\frac{\text{Width pixels}}{64} \times 2 - 1\right) \times \left(\frac{\text{Height pixels}}{64} \times 2 - 1\right) \tag{5.2}$$

Figure 5.6: (a) Non-overlapped sub-images, (b) Additional sub-images for the overlapped case

Now consider the case where the image size is not a dyadic integer but an arbitrary integer value. In our multiscale image decomposition, the size of the image to be processed remains unchanged at the first level. When performing sub-image localisation, we have to allow some overlap between sub-images even in the non-overlapped case, so that the sub-images are evenly distributed. The number of sub-images, K, for the non-overlapped and overlapped cases at any single level is therefore calculated respectively as:
$$K = \left\lceil \frac{\text{Width pixels}}{64} \right\rceil \times \left\lceil \frac{\text{Height pixels}}{64} \right\rceil \tag{5.3}$$

$$K = \left(\left\lceil \frac{\text{Width pixels}}{64} \right\rceil \times 2 - 1\right) \times \left(\left\lceil \frac{\text{Height pixels}}{64} \right\rceil \times 2 - 1\right) \tag{5.4}$$

where $\lceil \cdot \rceil$ is the round-up (ceiling) operator. The ceiling operator ensures that the sub-images are interconnected and that no section of the image is left out. For example, consider an image of size 193×332. For the non-overlapped case, the image contains $\lceil 193/64 \rceil = 4$ sub-images in the row direction and $\lceil 332/64 \rceil = 6$ sub-images in the column direction. For the overlapped case, the number of sub-images is 4×2−1 = 7 and 6×2−1 = 11 in the row and column directions respectively. The amount of overlap for the non-overlapped and overlapped cases can be computed respectively as:
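$$\text{Amount of overlapping} = 64 - \frac{\text{Width pixels} - 64}{\left\lceil \frac{\text{Width pixels}}{64} \right\rceil - 1} \tag{5.5}$$

$$\text{Amount of overlapping} = 64 - \frac{\text{Width pixels} - 64}{\left\lceil \frac{\text{Width pixels}}{64} \right\rceil \times 2 - 2} \tag{5.6}$$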
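The sub-image counts of equations (5.3) and (5.4) and the overlap amounts of equations (5.5) and (5.6) can be checked numerically. The following is a minimal sketch, not the original implementation; the function names `tiles_per_axis`, `sub_image_count` and `overlap_amount` are hypothetical, and returning `None` when only one sub-image fits along an axis is an assumption made here because the overlap formulas divide by zero in that case.

```python
from math import ceil

PATCH = 64  # sub-image side length used in the text


def tiles_per_axis(length, overlapped=False, patch=PATCH):
    """Number of sub-images along one axis, following (5.3) and (5.4)."""
    n = ceil(length / patch)
    return 2 * n - 1 if overlapped else n


def sub_image_count(width, height, overlapped=False):
    """K: total number of sub-images at a single level."""
    return tiles_per_axis(width, overlapped) * tiles_per_axis(height, overlapped)


def overlap_amount(length, overlapped=False, patch=PATCH):
    """Overlap in pixels between neighbouring sub-images, following (5.5) and (5.6).

    Assumption: returns None when only one sub-image fits along the axis,
    because the formulas are undefined (division by zero) there.
    """
    n = tiles_per_axis(length, overlapped, patch)
    if n == 1:
        return None
    return patch - (length - patch) / (n - 1)


# Worked example from the text: an image of size 193 x 332
print(tiles_per_axis(193), tiles_per_axis(332))              # non-overlapped
print(tiles_per_axis(193, True), tiles_per_axis(332, True))  # overlapped
```

For dyadic sizes the ceiling has no effect, so the same functions reproduce the 16 and 49 sub-images quoted for a 256×256 texture.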
Figure 5.7 shows how the amount of overlap varies with image size for the two approaches. From the figure, the minimum overlap for the non-overlapped case is 0, while the minimum overlap for the overlapped case is 32. The minimum overlap is achieved when the size is a multiple of 64. The maximum overlap is achieved when either the width or the height of the image is 65 (giving 2 sub-images in the originally non-overlapped case: the first sub-image takes the first 64 pixels, and the second takes the last 64 pixels). Since both approaches involve overlapping sub-images for most image dimensions, it is necessary to rename the originally non-overlapped case. From this point onwards, the originally non-overlapped case will be referred to as case 1 overlapping, and the originally overlapped case as case 2 overlapping.
Figure 5.7: Amount of pixels overlapping for (left) originally non-overlapped case, and (right) overlapped case
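The text fixes the number of sub-images and requires them to be evenly distributed, but it does not state how fractional positions are rounded. The sketch below is one possible placement rule, under the assumption that sub-images are spread evenly between the first and last valid offsets and that fractional offsets are rounded half up to the nearest pixel; the function name `tile_offsets` is hypothetical.

```python
from math import ceil


def tile_offsets(length, overlapped=False, patch=64):
    """Start offsets of the sub-images along one axis.

    Assumption: offsets are evenly spaced between 0 and length - patch,
    rounded half up to whole pixels.
    """
    n = ceil(length / patch)
    if overlapped:          # case 2 overlapping
        n = 2 * n - 1
    if n == 1:
        return [0]
    span = length - patch   # offset of the last sub-image
    # integer round-half-up of i * span / (n - 1)
    return [(2 * i * span + (n - 1)) // (2 * (n - 1)) for i in range(n)]


print(tile_offsets(65))         # first and last 64 pixels of a 65-pixel axis
print(tile_offsets(128, True))  # case 2 overlapping on a dyadic axis
```

With this rule a 65-pixel axis yields the two sub-images described above, one covering the first 64 pixels and one covering the last 64 pixels.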
Now that the sub-image coverage at the first scale has been defined, the image re-scaling process is discussed. As mentioned previously, the first level of the decomposition uses the original dimensions of the image. Re-scaling then proceeds as follows. For an image with M×N dimensions, the smaller of the two dimensions, min(M, N), is taken as the basis for re-scaling. Suppose the row dimension, M, is the smaller of the two. The image is then re-scaled so that M becomes the nearest dyadic integer smaller than M, while maintaining the aspect ratio of the image. The sub-image decomposition described previously is then performed on the re-scaled image to obtain the sub-images for the second level. From the second level onwards, the parent image at each following level is obtained simply by down-scaling the image by a factor of 2. This process continues until min(M, N) reaches 64. For example, consider an image of size 783×556. The image dimensions at each level are as follows.
• First level: 783×556
• Second level: 721×512 (512 is the nearest dyadic integer smaller than min(783, 556))
• Third level: 361×256
• Fourth level: 181×128
The lowest (fifth) level, 91×64, now consists of 2 sub-images for case 1 overlapping and 3 sub-images for case 2 overlapping. In general, for an M×N image, the number of scales can be computed as:
$$\text{Number of scales} = \left\lceil \frac{\log(\min(M, N))}{2\log 2} \right\rceil \tag{5.7}$$
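The re-scaling schedule can be traced with a short script. This is a sketch under stated assumptions rather than the original implementation: the helper names are hypothetical, the larger dimension is rounded half up (a rule chosen here because it reproduces the 783×556 example), and the halving from the second level onwards is expressed as the same "largest dyadic integer strictly below the smaller side" step, which is equivalent once the smaller side is already dyadic.

```python
def largest_dyadic_below(n):
    """Largest power of two strictly smaller than n (requires n > 1)."""
    return 1 << ((n - 1).bit_length() - 1)


def scale_dim(value, num, den):
    """value * num / den, rounded half up, using integer arithmetic only."""
    return (2 * value * num + den) // (2 * den)


def pyramid_sizes(m, n, patch=64):
    """Image dimensions at every level, from the original size down to the
    level whose smaller side equals the patch size.

    Assumption: the rounding rule for the larger side is round half up.
    """
    if min(m, n) < patch:
        raise ValueError("image is smaller than the sub-image patch")
    sizes = [(m, n)]
    while min(m, n) > patch:
        i = min(m, n)
        j = largest_dyadic_below(i)  # j < i, as in the re-scaling rule
        m, n = scale_dim(m, j, i), scale_dim(n, j, i)
        sizes.append((m, n))
    return sizes


for level, size in enumerate(pyramid_sizes(783, 556), start=1):
    print(level, size)
```

For the 783×556 example this produces five levels ending at 91×64, the same count that equation (5.7) gives for min(M, N) = 556.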
Recall that in chapter 4 we encountered the problem of image padding when performing the discrete wavelet frames (DWF) decomposition. It was found that periodic padding should be used if the translation invariance property is to be maintained. With the multiscale image decomposition technique, however, we are dealing with image blocks that are not actually separate entities. One might therefore argue that the border information can be extracted from neighbouring image blocks by borrowing their border pixels in the filtering operation. Done this way, the DWF decomposition and feature extraction are no longer independent for each spatial block, but in exchange we obtain an elegant solution to the padding problem, and the order of operations can be reversed. First, the entire image is decomposed using wavelet filtering; the patches are then slid across the stack of DWF coefficients to compute the features for each sub-image. After the image is re-scaled to the appropriate size, the DWF decomposition is applied again, and so on down to the lowest-scale image. This means that the DWF decomposition only has to be applied k times, where k is the number of scales (once per level), instead of being applied to every sub-image generated by the multiscale decomposition. The parent image at each scale, however, still needs to be padded using periodic padding. Feature extraction is also simplified by this technique: the standard deviation and the number of zero-crossings are computed directly within each block, like sliding standard deviation and zero-crossing operators over a stack of images, as shown in Figure 5.8.
Figure 5.8: A cube is slid on a stack of DWF coefficient images to compute the standard deviation and zero-crossings
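The sliding-cube idea of Figure 5.8 can be illustrated without any wavelet code, by treating the DWF output as a given stack of coefficient images. The sketch below is illustrative only: `block_features` and `zero_crossings` are hypothetical names, the stack is represented as plain nested lists, and the zero-crossing rule (sign changes between horizontally adjacent coefficients, with zero treated as non-negative) is an assumption, since the exact definition belongs to chapter 4.

```python
from statistics import pstdev


def zero_crossings(rows):
    """Count sign changes between horizontally adjacent samples.

    Assumption: zero counts as non-negative and only row-wise
    neighbours are compared.
    """
    count = 0
    for row in rows:
        for a, b in zip(row, row[1:]):
            if (a < 0) != (b < 0):
                count += 1
    return count


def block_features(stack, top, left, size=64):
    """Features of one block position across all coefficient images.

    `stack` is a list of 2-D lists, one per DWF channel, all of the
    same shape. Returns [std_0, zc_0, std_1, zc_1, ...].
    """
    features = []
    for channel in stack:
        rows = [r[left:left + size] for r in channel[top:top + size]]
        values = [v for r in rows for v in r]
        features.append(pstdev(values))          # population standard deviation
        features.append(zero_crossings(rows))
    return features


# Toy stack with two 2x4 "coefficient images" and a 2x2 block
oscillating = [[1, -1, 1, -1], [1, -1, 1, -1]]
flat = [[3, 3, 3, 3], [3, 3, 3, 3]]
print(block_features([oscillating, flat], top=0, left=0, size=2))
```

Because the decomposition is computed once per level, moving the block only changes the `top` and `left` offsets; no filtering is repeated per sub-image.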
However, since we subtract the mean of the image before applying the wavelet frames decomposition (recall chapter 4), this could pose a problem. The purpose of subtracting the image mean is to make the image zero-mean, so that brightness invariance can be ensured. Applying the wavelet frames decomposition to the whole image before block decomposition implies that the mean subtracted is the global mean, not the local mean of a particular texture. As a result, not only is the brightness invariance property lost, but the retrieved images might also be inaccurate. To confirm this, we will evaluate both approaches (DWF followed by block decomposition, and block decomposition followed by DWF) in the experimental section. Figures 5.9 and 5.10 show the flowcharts of the two approaches to the proposed multiscale image decomposition technique.
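The difference between global and local mean removal can be seen on a toy example. The sketch below is not part of the retrieval pipeline; the pixel values and the function name `block_means_after_subtraction` are invented for illustration. It shows only that subtracting the global mean leaves each block with a brightness-dependent offset, whereas subtracting the block's own mean does not.

```python
def mean(values):
    return sum(values) / len(values)


def block_means_after_subtraction(blocks, use_global_mean):
    """Mean of each block after mean removal.

    `blocks` is a list of flat pixel lists, one per sub-image.
    With use_global_mean=True the mean of the whole image is removed
    (DWF before block decomposition); otherwise each block's own mean
    is removed (block decomposition before DWF).
    """
    global_mean = mean([p for block in blocks for p in block])
    result = []
    for block in blocks:
        offset = global_mean if use_global_mean else mean(block)
        result.append(mean([p - offset for p in block]))
    return result


# Two hypothetical blocks with the same pattern but different brightness
dark = [10, 20, 10, 20]
bright = [110, 120, 110, 120]
print(block_means_after_subtraction([dark, bright], use_global_mean=True))
print(block_means_after_subtraction([dark, bright], use_global_mean=False))
```

In the global case the two blocks end up with residual means of opposite sign, so identical textures at different brightness levels are no longer treated alike.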
[Flowchart: starting from the image I(M, N), divide the image into several 64×64 sub-images. For each sub-image, perform DWF decomposition, compute its feature vector and add it to the final feature vector. When all sub-images are finished, check whether min(M, N) = 64: if yes, output the final feature vector; if no, re-scale the image by a factor of j/i, where i = min(M, N) and j is the nearest dyadic integer smaller than i (j < i), and divide the re-scaled image into sub-images again.]
Figure 5.9: Flowchart of the proposed multiscale image decomposition technique (Block decomposition followed by DWF)