Windowing versus Best Tiling for Wavelet Image Compression
Wee Sun Lee
Department of Computer Science
School of Computing
National University of Singapore
July 7, 1999
Abstract
We compare two methods for entropy coding uniformly quantized wavelet coefficients: windowing, which uses local statistics to adapt the probability assignment, and best tiling, which aims to compress as well as the best tiling of a subband with rectangular tiles of probability models. We find that on some synthetic images, the best tiling method works considerably better than windowing, but on a set of natural images, the improvement provided by the best tiling method over windowing is very small.
1 Introduction
Wavelet coefficients are known to be large around edges and small in smooth regions of an image. For effective compression of these coefficients, it is desirable to use different probability models for different regions of a wavelet subband. In this paper, we compare the performance of two sequential probability assignment methods which are able to adapt the probability assignment to the region being coded. The methods are windowing, which uses the local statistics to estimate the best probability assignment, and best tiling, which aims to compress as well as the best tiling of the subband using rectangular tiles of probability models of various sizes.
In sequential probability assignment, the predictor provides a probability assignment $a(x_i \mid x^{i-1})$ for $x_i$ given $x^{i-1}$, where $x^{i-1}$ denotes the sequence of coefficients $(x_0, \ldots, x_{i-1})$. The probability assigned to the image $(x_0, \ldots, x_{t-1})$ is then
$$p(x_0, \ldots, x_{t-1}) = a(x_0)\, a(x_1 \mid x^{0}) \cdots a(x_i \mid x^{i-1}) \cdots a(x_{t-1} \mid x^{t-2}).$$
Using an arithmetic coder, the code length $-\log_2 p(x_0, \ldots, x_{t-1})$ can be approached with very small redundancy. Both windowing and best tiling attempt to provide an assignment function $a(\cdot \mid x^{i-1})$ such that a high probability is assigned to the image, which minimizes the code length.
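As a rough illustration of how the code length accumulates, the sketch below sums the negative base-2 logarithm of each sequentially assigned probability. The function names and the add-one counting rule used as a toy predictor are assumptions made for this example only; neither windowing nor best tiling is implemented here.

```python
import math

def ideal_code_length(symbols, assign):
    """Return -log2 p(x_0..x_{t-1}) in bits for a sequential assignment.

    `assign(past)` is a hypothetical callback returning a dict that maps
    each possible next symbol to its probability given the tuple `past`.
    """
    bits = 0.0
    for i, x in enumerate(symbols):
        probs = assign(tuple(symbols[:i]))
        bits += -math.log2(probs[x])
    return bits

def add_one_rule(past, alphabet=(0, 1)):
    """Toy adaptive assignment: add-one counts over a binary alphabet."""
    total = len(past) + len(alphabet)
    return {s: (past.count(s) + 1) / total for s in alphabet}

bits = ideal_code_length([0, 0, 1, 0], add_one_rule)
```

An arithmetic coder driven by the same assignment would produce an output close to `bits` in length, which is why the comparison in this paper can focus on the probability assignment alone.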
Variants of windowing have previously been used in successful wavelet image coders [3, 6]. These methods essentially assume that the coefficients outside a small window around the coefficient being coded are irrelevant. We use a simple variant of windowing which weights a finite number $N$ of probability models using Bayesian weighting, assuming that only the pixels inside the window exist.
The comparison class of best tiling of an image using rectangles of probability models was introduced in [5], together with an algorithm with redundancy $O(k \log \frac{Nn}{k})$, where $k$ is the number of tiles, $N$ is the number of probability models used and $n \times n$ is the size of the image. The rectangles are allowed to be at arbitrary positions, be of arbitrary heights and be of a finite number $D$ of widths. The computational complexity of the algorithm is $O(DNn^2)$.
We find that on some synthetic images which can be tiled using a small number of rectangular tiles, the best tiling method significantly outperforms the windowing method. However, on a set of natural images, the improvement provided by the best tiling method is quite small.
Neither the windowing nor the best tiling method exploits interband information in compressing the wavelet coefficients. We also compared the performance of the best tiling coder against a zerotree coder (SPIHT [7]), which exploits a parent-child relationship between wavelet subbands. We find that the zerotree coder performs better on some of the synthetic images but provides approximately the same compression performance on the set of natural images.
2 Windowing
When assigning a probability mass $a(x_i \mid x^{i-1})$, windowing considers only the members of $x^{i-1}$ which fall within a window of neighbouring pixels. We denote these members as $x^{i-1,w}$. Given $N$ possible models, we use a weighting method which sets
$$a(x_i \mid x^{i-1}) = p(x_i \mid x^{i-1,w}) = \frac{\sum_{j=1}^{N} p(\theta_j)\, p(x^{i-1,w} \mid \theta_j)\, p(x_i \mid \theta_j)}{\sum_{j=1}^{N} p(\theta_j)\, p(x^{i-1,w} \mid \theta_j)}$$
for probability models $\{\theta_1, \ldots, \theta_N\}$.
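A minimal sketch of this weighting, assuming a uniform prior over the models (so the prior cancels in the ratio) and representing each model as a dictionary from symbol to probability. Both the representation and the names are hypothetical and chosen only for illustration.

```python
import math

def window_mixture(window, models, alphabet):
    """Mix N models by their posterior given the coefficients in the window.

    `window` is the list of already coded coefficients inside the window,
    `models` is a list of dicts mapping symbol -> probability.
    """
    # Log-likelihood of the window contents under each model.
    loglik = [sum(math.log(m[x]) for x in window) for m in models]
    top = max(loglik)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in loglik]
    norm = sum(weights)
    return {s: sum(wt * m[s] for wt, m in zip(weights, models)) / norm
            for s in alphabet}

models = [{0: 0.9, 1: 0.1}, {0: 0.5, 1: 0.5}]
after_zeros = window_mixture([0, 0, 0], models, (0, 1))
no_context = window_mixture([], models, (0, 1))
```

With an empty window the result is the plain average of the models; after three zeros the mixture leans towards the model that favours zero.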
In this paper, we use a rectangular window of width $2w+1$, centered around the coefficient being coded, with a uniform distribution for the a priori probability $p(\theta_j)$. For each coefficient, we need to calculate $p(x^{i-1,w} \mid \theta_j)$. One way to do this is to calculate $\log p(x^{i-1,w} \mid \theta_j) = \sum_{x_k \in x^{i-1,w}} \log p(x_k \mid \theta_j)$. Computationally, the sum of numbers in a rectangular window can be obtained very quickly regardless of the window size, giving algorithms of complexity $O(Nn^2)$ when using the raster scan. One way to calculate the sum is to use
$$S_1(i,j) - S_2(i-w, j+w) - S_2(i-1, n-1) + S_2(i-1, j+w) - S_2(i, j-w) + S_2(i-w, j-w),$$
where $S_1(a,b)$ is the sum of all the numbers seen so far during the scan up to $(a,b)$ and $S_2(a,b)$ is the sum of all the numbers in the rectangle which starts at the origin and ends at $(a,b)$, with the origin at the top left-hand corner of the image and the coordinates increasing in the direction of the scan. The functions $S_1$ and $S_2$ can be updated recursively in constant time.
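The constant-time window sum can be sketched with a standard summed-area table. The indexing below uses half-open, zero-padded prefix sums instead of the two running sums described in the text, and the clipping at the image border is an assumption, so this is an equivalent illustration and not the exact bookkeeping used in the coder.

```python
def causal_window_sums(vals, w):
    """For each (i, j), sum vals over the causal part of the window.

    The causal part is rows i-w..i-1, columns j-w..j+w, plus row i,
    columns j-w..j-1, all clipped to the image.
    """
    n_rows, n_cols = len(vals), len(vals[0])
    # P[a][b] = sum of vals over rows 0..a-1 and columns 0..b-1.
    P = [[0.0] * (n_cols + 1) for _ in range(n_rows + 1)]
    for a in range(n_rows):
        for b in range(n_cols):
            P[a + 1][b + 1] = vals[a][b] + P[a][b + 1] + P[a + 1][b] - P[a][b]

    def rect(r0, r1, c0, c1):
        """Sum over rows r0..r1-1 and columns c0..c1-1, clipped to the image."""
        r0, r1 = max(r0, 0), min(r1, n_rows)
        c0, c1 = max(c0, 0), min(c1, n_cols)
        if r0 >= r1 or c0 >= c1:
            return 0.0
        return P[r1][c1] - P[r0][c1] - P[r1][c0] + P[r0][c0]

    return [[rect(i - w, i, j - w, j + w + 1) + rect(i, i + 1, j - w, j)
             for j in range(n_cols)] for i in range(n_rows)]

sums = causal_window_sums([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 1)
```

Each lookup touches only entries that precede (i, j) in the raster scan, so the table could equally be filled in during the scan. In the setting of this section, `vals` would hold the log-probabilities of the coefficients under one model, with one table per model.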
Windowing should work well when the statistics of the image change slowly. However, when a large region can be coded optimally with the same probability model, windowing may be suboptimal since it only considers the statistics in a small region around the coefficient that is being coded.
3 Best Tiling
The best tiling method is derived using the specialist model, first proposed by Blum [2] and studied in [4]. Using the specialist model, a sequential probability assignment algorithm which has redundancy $O(k \log \frac{Nn}{k})$ with respect to the class of any tiling of an image with $k$ arbitrarily sized rectangular tiles was given in [5]. By redundancy, we mean the additional code length produced by the algorithm compared to the optimal tiling using $k$ tiles of probability models. The bound holds regardless of the input sequence encountered, and the value of $k$ does not have to be known in advance by the algorithm. Hence, the algorithm should perform well whenever a tiling of the image which works well with a small number of tiles exists. The computational complexity of the algorithm is $O(Nn^3)$. If we restrict the comparison class to rectangles with $D$ discrete widths, which is what we do in this paper, the computational complexity can be improved to $O(DNn^2)$.
We first consider the case of tiling an $n \times n$ image with rectangles of arbitrary height but one fixed width $w$. We assume that the coefficients are processed in raster scan order. The origin is at the top left-hand corner of the image, the vertical coordinate is $y$, the horizontal coordinate is $z$ and the coordinates increase in the direction of the scan. Let $a^{(y,z)}$ be the probability vector produced by the algorithm for coding the pixel $(y,z)$ and let $a_j = (a_{j,0}, \ldots, a_{j,b-1})$ be the probability vector associated with model $j$ with an alphabet of size $b$. From [5], we have
$$a^{(y,z)} = \frac{\sum_{j=1}^{N} Q_j^{(y,z)}\, a_j}{\sum_{j=1}^{N} Q_j^{(y,z)}},$$
where $Q_j^{(y,z)}$ is updated by the following equations:
$$Z_j^{(y,z)}(y, z+w) = (n-y)\left(\frac{R_j^{(y-1,z+w)}\, Z_j^{(y-1,z+w)}(y-1, z+w)}{n-y+1} + 1\right)$$
$$R_j^{(y,z)}\, Z_j^{(y,z)}(y, z) = (n-y)\, R_j^{(y,z),w}\left(\frac{R_j^{(y-1,z)}\, Z_j^{(y-1,z)}(y-1, z)}{n-y+1} + 1\right)$$
$$R_j^{(y,z),w} = \begin{cases} R_j^{(y,z)} & \text{if } z = 0 \\ R_j^{(y,z-1),w}\, R_j^{(y,z)} & \text{if } z < w \\ \dfrac{R_j^{(y,z-1),w}\, R_j^{(y,z)}}{R_j^{(y,z-w)}} & \text{if } w \leq z \leq n-1 \\ \dfrac{R_j^{(y,z-1),w}}{R_j^{(y,z-w)}} & \text{if } z > n-1 \end{cases}$$
$$R_j^{(y,z)} = \frac{a_{j,x^{(y,z)}}}{a^{(y,z)}_{x^{(y,z)}}}$$
The initial conditions needed are $Q_j^{(0,0)} = nw$ and
$$Q_j^{(y,0)} = \sum_{k=0}^{w-1} (n-y)\left(\frac{R_j^{(y-1,k)}\, Z_j^{(y-1,k)}(y-1, k)}{n-y+1} + 1\right).$$
With $D$ distinct widths, we only have to run $D$ distinct copies of the algorithm and sum the values of $Q_j^{(y,z)}$ for each $j$. A minor complication arises at the boundaries of the $z$ coordinate, since multiple rectangles of different widths which go beyond the boundaries are in fact equivalent. One simple method of getting around this is to modify the algorithm for all widths $w$ other than the largest width in such a way that $R_j^{(y,z)} Z_j^{(y,z)}(y,z) = 0$ for $z < w$, $Z_j^{(y,z)}(y, z+w) = 0$ for $z \geq n-w-1$ and $Q_j^{(y,0)} = 0$.
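One piece of the update that lends itself to a short illustration is the width-$w$ recursion along a row. Read as a running product of the per-pixel ratios over the $w$ most recent columns (an interpretation of the recursion, not something the text states explicitly), it can be maintained with one multiplication and one division per step. The sketch below covers only this piece for a single row and a single model; the function name and the list representation are hypothetical, and the full tiling update is not implemented.

```python
def window_products(row, w):
    """Running product of the last w entries of `row`, computed recursively.

    The index z runs past the end of the row (z > n-1) so that windows
    sticking out over the right boundary are covered as well.
    Assumes 1 <= w <= len(row) and strictly positive entries.
    """
    n = len(row)
    out = []
    for z in range(n + w - 1):
        if z == 0:
            cur = row[0]
        elif z < w:
            cur = out[z - 1] * row[z]                # window still growing
        elif z <= n - 1:
            cur = out[z - 1] * row[z] / row[z - w]   # slide by one column
        else:
            cur = out[z - 1] / row[z - w]            # window leaving the row
        out.append(cur)
    return out

products = window_products([2.0, 4.0, 0.5, 8.0], 2)
```

In a practical coder the products would more likely be kept in the log domain to avoid underflow; plain floating-point arithmetic is used here only to keep the correspondence with the four cases of the recursion visible.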
4 Simulation Results
For all the simulations, we use 10 probability models: a uniform distribution and nine Laplacian distributions $p(x) = \frac{1}{2\theta_j} e^{-|x|/\theta_j}$ with $\theta_j \in \{1, 2, 4, 8, 16, 32, 64, 128, 256\}$, which have been discretized according to the quantization intervals. Uniform quantization with a dead band (the bin containing zero is twice as large as the other bins) is used. The tiling method is used with tiles of five different widths: 2, 4, 8, 16 and 32. A five-level decomposition with the 9-7 tap biorthogonal spline filters [1] is used.
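To make the discretization concrete, the following sketch integrates a Laplacian density over the bins of a dead-band quantizer. The exact bin boundaries, the truncation at a maximum index and the function name are assumptions; the text only states that the zero bin is twice as wide as the others.

```python
import math

def laplacian_bin_probs(theta, step, max_index):
    """Probability of each quantizer index -max_index..max_index.

    Assumed layout: index 0 covers (-step, step); index k > 0 covers
    [k*step, (k+1)*step), mirrored for k < 0. The outermost bins absorb
    the tails so that the probabilities sum to one (max_index >= 1).
    """
    def tail(t):
        # P(X >= t) for t >= 0 under p(x) = exp(-|x|/theta) / (2*theta)
        return 0.5 * math.exp(-t / theta)

    probs = {0: 1.0 - 2.0 * tail(step)}
    for k in range(1, max_index + 1):
        upper = 0.0 if k == max_index else tail((k + 1) * step)
        probs[k] = probs[-k] = tail(k * step) - upper
    return probs

probs = laplacian_bin_probs(theta=4.0, step=1.0, max_index=50)
```

Calling the function once per scale parameter would give nine tables of this kind, to which the uniform model is added as the tenth.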
We performed simulations on 12 images of size 256 × 256 from GreySet1 of the Waterloo Bragzone (http://links.uwaterloo.ca/bragzone.base.html). The first six images are synthetic and are shown in Figure 1. The PSNR results at 0.25 bits per pixel are shown in Table 1 for windowing with window size $2w+1$ for $w = 1, 2, 4$ as well as for the best tiling method.
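The text does not spell out its PSNR definition. Assuming the usual one for 8-bit greyscale images, with a peak value of 255, the figure of merit can be sketched as follows (the function name and the list-of-rows image representation are hypothetical):

```python
import math

def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equally sized images given as lists of rows."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(original, decoded)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    # Identical images have zero error, hence unbounded PSNR.
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

value = psnr([[10, 20]], [[11, 19]])
```

Under this definition a difference of 1 dB corresponds to roughly a 21% reduction in mean squared error, which gives a sense of scale for the gaps reported in the tables.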
The tiling method performed significantly better for the images horiz and squares. Visual inspection reveals that these images (and hence their corresponding wavelet coefficients) can be tiled using a small number of tiles. These results are in agreement with the theory, which assures us that the algorithm will perform well whenever the subbands can be tiled using a small number of tiles.
The tiling method performed similarly to windowing on the image circles and slightly poorer than windowing on the image crosses. The image circles contains circular objects which can only be tiled using many rectangular tiles. Similarly, crosses contains many diagonal lines which again can only be satisfactorily covered using many rectangular tiles of probability models.
The rest of the images in the set are natural images and are shown in Figure 2. The PSNR results at 0.25 bits per pixel are shown in Table 2 for the best tiling method and for windowing with window size $2w+1$ for $w = 1, 2, 4$.
For these images, the tiling method has only a small advantage over windowing. We interpret this to mean that the portions of natural images which are suited to rectangular tiling of probability models and yet are poorly modelled by windowing are small. Tiling appears to work well when there are many vertical and horizontal structures in the image. Tiling also works better when there are large areas of similar statistics, which suit the use of large tiles rather than constant-size windows.
Visual inspection of the natural images suggests that many of the regions will be better tiled using some polygonal and ellipsoidal tiles. The fact that the tiling method works well when the rectangular tiles assumption is satisfied is encouraging. However, more work will have to be done to find computationally efficient methods for compressing as well as the best tiling using more flexibly shaped tiles.
Figure 1: Synthetic images. In raster scan order, the images are circles, crosses, horiz, slope, squares, text.
Image     Window (w=1)   Window (w=2)   Window (w=3)   Tile
circles   31.63          32.44          31.89          32.44
crosses   26.36          26.38          25.87          25.96
horiz     42.95          43.74          42.31          49.79
slope     40.17          40.92          39.38          43.42
squares   50.70          51.63          50.69          57.53
text      14.71          14.81          14.76          15.09
Table 1: PSNR values for synthetic images at 0.25 bits per pixel.
4.1 Intra versus inter subband coding
Zerotree coding methods [8] obtain good compression performance by exploiting parent-child relationships in a tree structure of wavelet coefficients, where the parent and children are located in different subbands. In contrast, the windowing and tiling methods described above use only information from the same subband. However, the windowing and tiling methods are not sensitive to shifts in the position of objects in the image, unlike the zerotree method, which uses a fixed structure on the wavelet coefficients where the only flexibility comes from the parent-children relationships.
We compare the performance of the tiling method to that of a zerotree compression method, SPIHT [7]. The results for the synthetic images are shown in Table 3 while the results for the natural images are shown in Table 4. SPIHT performs better than the tiling method on five of the six synthetic images, the exception being text. One possible reason for this is that the interband information is more useful than the increased flexibility of the tiling method for compressing the synthetic images. Performance on the natural images is similar for the two methods.
Figure 2: Natural images. In raster scan order, the images are bird, bridge, camera, goldhill, lena, montage.
Image     Window (w=1)   Window (w=2)   Window (w=3)   Tile
bird      36.55          37.42          37.28          37.70
bridge    23.96          24.17          24.25          24.31
camera    27.15          27.76          27.67          27.88
goldhill  26.49          26.97          27.08          27.08
lena      28.07          28.88          28.88          29.00
montage   29.94          30.18          29.49          30.88
Table 2: PSNR values for natural images at 0.25 bits per pixel.
Image     Tile    SPIHT
circles   32.44   32.95
crosses   25.96   28.45
horiz     49.79   50.62
slope     43.42   44.07
squares   57.53   61.16
text      15.09   14.40
Table 3: PSNR values for synthetic images at 0.25 bits per pixel.
Image     Tile    SPIHT
bird      37.70   37.75
bridge    24.31   24.33
camera    27.88   27.97
goldhill  27.08   27.10
lena      29.00   28.96
montage   30.88   30.76
Table 4: PSNR values for natural images at 0.25 bits per pixel.
4.2 Larger Images
We also performed simulations on a set of 12 larger images from GraySet2 of the Waterloo Bragzone. The images are shown in Figure 3. The results are shown in Table 5. Again, SPIHT performed better for the graphical image france, while the tiling method performed better for images with strong vertical and horizontal components such as library.
Image     Height × Width   Window (w=1)   Window (w=2)   Window (w=3)   Tile    SPIHT
barbara   512 × 512        27.79          28.44          28.46          28.44   28.13
boat      512 × 512        29.86          30.76          30.75          30.95   30.97
france    496 × 672        21.37          21.81          21.78          22.57   22.91
frog      498 × 621        24.71          25.10          24.14          25.16   25.33
goldhill  512 × 512        30.03          30.47          30.48          30.56   30.56
lena      512 × 512        32.85          33.95          33.95          34.09   34.11
library   353 × 464        21.22          21.36          21.24          21.73   19.82
mandrill  512 × 512        22.70          23.18          23.31          23.36   23.27
mountain  480 × 640        18.68          18.92          18.96          19.08   19.37
peppers   512 × 512        32.49          33.23          33.09          33.43   33.47
washsat   512 × 512        33.40          34.11          34.20          34.70   34.18
zelda     512 × 512        36.23          37.37          37.44          37.54   37.50
Table 5: Results for the larger images.
Figure 3: Larger images. In raster scan order, the images are barbara, boat, france, frog, goldhill, lena, library,
5 Conclusions
We have compared the performance of windowing and best tiling for sequential probability assignment to wavelet coefficients for image compression. The performance of the best tiling method in our simulations agrees with the theoretical result, which suggests that the method will work well when the image can be tiled using a small number of rectangular tiles. The performance of best tiling is significantly better than that of windowing for some synthetic images which satisfy the tiling assumption. However, the performance of the two methods is similar for most natural images.
6 Acknowledgements
The image coders used in this paper were modified from the Wavelet Image Compression Construction Kit, written by Geoff Davis, John Danskin and Ray Heasman
(http://www.cs.dartmouth.edu/ gdavis/wavelet/wavelet.html).
References
[1] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies. Image coding using wavelet transform. IEEE Transactions on Image Processing, pages 205–220, April 1992.
[2] Avrim Blum. Empirical support for winnow and weighted-majority based algorithms: results on a calendar scheduling domain. In Proceedings of the Twelfth International Conference on Machine Learning, pages 64–72, 1995.
[3] C. Chrysafis and A. Ortega. Efficient context-based lossy wavelet image coding. In Proc. of Data Compression Conference, 1997.
[4] Yoav Freund, Robert E. Schapire, Yoram Singer, and Manfred K. Warmuth. Using and combining predictors that specialize. In Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, 1997.
[5] Wee Sun Lee. Compressing as well as the best tiling of an image. Manuscript, July 1999.
[6] S. M. LoPresto, K. Ramchandran, and M. T. Orchard. Image coding based on mixture modelling of wavelet coefficients and a fast estimation-quantization framework. In Proceedings of the Data Compression Conference, pages 221–230, 1997.
[7] A. Said and W.A. Pearlman. A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Trans. on Circuit and Systems for Video Technology, 6(3):243–249, 1996.
[8] Jerome M. Shapiro. Embedded image coding using zerotrees of wavelet coefficients. IEEE Transactions on