Lossless Image Compression Using LZW and Adaptive Huffman Coding

(1)

Meera Gangadharan

, IJRIT 10

International Journal of Research in Information Technology (IJRIT)

www.ijrit.com www.ijrit.com www.ijrit.com

www.ijrit.com ISSN 2001-5569

Lossless Image Compression Using LZW and Adaptive Huffman Coding

Meera Gangadharan¹ and Ahammed Siraj K K²

1M.Tech Student, Department of Computer Engineering, Model Engineering College Ernakulam, Kerala, India

[email protected]

2 Associate Professor, Department of Computer Engineering, Model Engineering College Ernakulam, Kerala, India

[email protected]

Abstract

Images play an indispensable role in representing vital information and needs to be saved for further use or can be transmitted over a medium. In order to have efficient utilization of disk space and transmission rate, images need to be compressed. Image compression is the technique of reducing the file size of an image without compromising with the image quality beyond a certain limit. This reduction in file size saves disk/memory space and allows faster transmission of images over a medium. Image compression has been used for a long time and many algorithms have been devised.

In this proposed work an image compression technique is introduced by combining the adaptive Huffman encoding approach along with LZW to increase the compression ratio. This method consists of steps such as splitting the image into bit planes and applying LZW coding and adaptive Huffman coding algorithms to each bit plane so as to obtain a better compression. The system will provide lossless compression with better compression ratio.

Keywords: Lossless Compression, LZW Coding, Adaptive Huffman Coding

1. Introduction

The rapid growth of digital imaging applications, including desktop publishing, multimedia, and high visual definition has increased the need for effective and standardized image compression techniques.

Digital images play a very important role for describing the detailed information. The key obstacle for many applications is the vast amount of data required to represent a digital image directly. The various processes of digitizing the images to obtain it in the best quality for the more clear and accurate information leads to the requirement of more storage space and better storage and accessing mechanism in the form of hardware or software. Therefore, image compression is essential for image storage and image transmission in most cases. The objective of image compression is to reduce the number of bits as much as possible, while keeping the resolution and the visual quality of the reconstructed image as close to the original image as possible. A device (software or hardware) that compresses data is often know as an encoder or coder, whereas a device that decompresses data is known as a decoder. A device that acts as both a coder and decoder is known as a codec.

The image compression techniques are broadly classified into two categories depending whether or not an exact replica of the original image could be reconstructed using the compressed image. These are:

(2)

Meera Gangadharan

, IJRIT 11

• Lossless technique

• Lossy technique

Lossless compression involves with compressing data which, when decompressed, will be an exact replica of the original data. In lossless compression techniques, the original image can be perfectly recovered from the compressed (encoded) image. These are also called noiseless since they do not add noise to the signal (image). It is also known as entropy coding since it use statistics/decomposition techniques to eliminate/minimize redundancy. Lossless compression is used only for a few applications with stringent requirements such as medical imaging. Run length encoding, Huffman encoding, LZW coding, Area coding are the techniques included in lossless compression.

Lossy schemes provide much higher compression ratios than lossless schemes. Lossy schemes are widely used since the quality of the reconstructed images is adequate for most applications. By this

scheme, the decompressed image is not identical to the original image, but reasonably close to it.

Transformation coding, Vector quantization, Fractal coding, Block Truncation Coding, Sub band coding are the techniques used for lossy compression.

Image Compression is the most useful and commercially successful technologies in the field of Digital Image Processing. The numbers of images compressed and decompressed daily are innumerable. In this project we are aiming to achieve an efficient lossless compression of images.

2. Proposed Method

This paper proposes a new lossless image compression method which combines dictionary based LZW technique with adaptive Huffman coding technique. In this proposed method, first the input image is split into bit planes. Then upon each bit planes LZW compression followed by adaptive Huffman coding is applied. As a result we obtain compressed bit planes.

Fig 2.1 Block Diagram for Image Compression

Fig 2.2 Block Diagram for Image Decompression

The method works as follows :

(3)

Meera Gangadharan

, IJRIT 12

The grayscale image is first divided into 8 bit planes by bit plane separation. Then for each bit plane the following steps are performed:

• Step 1: LZW coding algorithm is applied.

• Step 2: For sequence of codes we obtained as the output from first step, adaptive Huffman coding is applied.

After performing these two steps, we will get compressed form of each bit planes. Now for decompressing image these operations are performed in the reverse order:

• First the adaptive Huffman decoding is applied to the codes and then LZW decoding algorithm is applied so as to obtain original bit planes.

• Finally all the bit planes are combined to obtain the reconstructed image.

2.1 Bit plane Separation

Biplane separation is used for highlighting the contribution made to the total image appearance by specific bits. Assuming that each pixel is represented by 8-bits, the image is composed of eight 1-bit planes. Plane (0) contains the least significant bit and plane (7) contains the most significant bit. Only the higher order bits (top four) contain the majority visually significant data. The other bit planes contribute the more subtle details.

Algorithm for Bit plane separation

Input : Graysacle Image Output : Bit planes

GSImage Read(Grayscale Image) GSImageSize Size(GSImage) for i 0 to 7 do

for j 0 to GSImageSize do

Bitplane(i, j) mod(GSImage[j], 2) end for

GSImage GSImage/2 end for

2.2 LZW Compression

LZW compression method was developed originally by Ziv and Lempel, and subsequently improved by Welch[24].The key to LZW is building a dictionary of sequences of symbols (strings) as the data is read and compressed. Whenever a string is repeated, it is replaced with a single code word in the output. At decompression time, the same dictionary is created and used to replace code words with the corresponding strings .

In LZW compression algorithm, the input file that is to be compressed is read symbol by symbol and they are combined to form a string. The process continues till it reaches the end of file. Every new string is assigned some code and stored in dictionary. They can be referred when the string is repeated with that code. The decompression algorithm expands the compressed file.

Applying LZW in individual bit planes achieves better compression than doing the same on a grayscale image. Since there are only two symbols (1 and 0) in bit plane representation, compared to 256 symbols (0255) in grayscale image, there is more chance for repetition among sequence of symbols leading to better compression.

(4)

Meera Gangadharan

, IJRIT 13

Algorithm for LZW Compression

Input : Bitpalnes

Output : LZW Compressed Bitplanes

Initialize the dictionary to contain all symbols of length one. (D=0,1) String FirstInputCharacter

while String != EOF do

char NextInputCharacter

if String + char is in the dictionary then String String + char else

Output the code for String

Add String+char to the dictionary as new entry end if

String char end while

Algorithm for LZW Decompression

Input : LZW Compressed Bitplanes Output : Bitplanes

Initialize the dictionary to contain all symbols of length one. (D=0,1) Old FirstInputCharacter

Output the Symbol for Old while String != EOF do

New NextInputCode

If New is not in the dictionary then String SymbolforOld String String + Char else

String SymbolforNew Output String

end if

char FirstCharOfString

Add Old+Char to the dictionary as new entry Old New

end while

2.3 Adaptive Huffman Coding

The basic idea of Huffman coding [30] is that symbols that occur frequently have a smaller representation than those that occur rarely. The Huffman algorithm requires both the encoder and the decoder to know the frequency table of symbols related to the data being encoding, so it requires the statistical knowledge which is often not available. Even when it is available, it could be a heavy overhead especially when many tables had to be sent. To avoid building the frequency table in advance, an alternative method called the Adaptive Huffman algorithm is introduced. It allows the encoder and the decoder to build the frequency table dynamically according to the data statistics up to the point of encoding and decoding.

A key point is that neither the tree nor its modification needs to be transmitted, because the sender and receiver use the same modification algorithm and thus always have equivalent copies of the tree.

(5)

Meera Gangadharan

, IJRIT 14

Huffman coding requires prior knowledge of the probabilities of the source sequence[30]. If this knowledge is not available, Huffman coding becomes a two pass procedure: the statistics are collected in the first pass and the source is encoded in the second pass. In the Adaptive Huffman coding procedure, neither transmitter nor receiver knows anything about the statistics of the source sequence at the start of transmission. The tree at both the transmitter and the receiver consists of a single node that corresponds to all symbols Not Yet Transmitted (NYT) and has a weight of 0. As transmission progresses, nodes corresponding to symbols transmitted will be added to the tree and the tree is reconfigured using an update procedure. In the process of coding, the probability for the incoming source sequence is assigned as the elements get in to the tree formed. This gives rise to a problem where the elements which come in the initial stages of tree formation having lesser probability hold smaller codes. Thereby, the compression ratio obtained is lesser. Using Adaptive Huffman algorithm[20], we derived probabilities which dynamically changed with the incoming data, through Binary tree construction. Thus the Adaptive Huffman algorithm provides effective compression by just transmitting the node position in the tree without transmitting the entire code. Unlike static Huffman algorithm the statistics of the data need not be known for encoding the data. It is useful for analyzing the relative importance played by each bit of the image.

2.3.1 Encoding Procedure

Initially, the tree at both the encoder and decoder consists of a single node, the NYT node. Therefore, the codeword for the very first symbol that appears is a previously agreed-upon fixed code. After the very first symbol, whenever we have to encode a symbol that is being encountered for the first time, we send the code for the NYT node, followed by the previously agreed-upon fixed code for the symbol. The code for the NYT node is obtained by traversing the Huffman tree from the root to the NYT node. This alerts the receiver to the fact that the symbol whose code follows does not as yet have a node in the Huffman tree. If a symbol to be encoded has a corresponding node in the tree, then the code for the symbol is generated by traversing the tree from the root to the external node corresponding to the symbol [32].

Algorithm for Encoding Procedure

Input: LZW Compressed Bitplanes Output: Compressed Bitplanes

Initialize root node While String != EOF do

Symbol Readthesymbol if Symbol is seen before then

Output the code for path from root node to corresponding leaf node else

Output code for NYT node followed by 16 bit representation of the symbol.

end if

Update the tree end while

2.3.2 Decoding Procedure

As we read in the received binary string, we traverse the tree in a manner identical to that used in the encoding procedure [32]. Once a leaf is encountered, the symbol corresponding to that leaf is decoded. If the leaf is the NYT node, then we check the next 16 bits and output the symbol corresponding to it else output the symbol stored in the leaf node. Once the symbol has been decoded, the tree is updated and the next received bit is used to start another traversal down the tree.

Algorithm for Decoding Procedure

(6)

Meera Gangadharan

, IJRIT 15

Input : Compressed Bit planes Output : LZW Compressed Bit planes

Initialize the root node

Go to the root of constructed tree while String != EOF do

if Current node is not a leaf node then

Read a bit and move right down tree if "1" and left if "0"

else if Current node is a NYT node then Read the next 16 bit character else

Get the character corresponding to the value stored in the leaf node end if

Write out the symbol and update the tree end while

2.3.3 Update Procedure

The update procedure[32] requires that the nodes be in a fixed order. This ordering is preserved by numbering the nodes. The largest node number is given to the root of the tree, and the smallest number is assigned to the NYT node. The numbers from the NYT node to the root of the tree are assigned in increasing order from left to right, and from lower level t upper level. The set of nodes with the same weight makes up a block After a symbol has been encoded or decoded, the external node corresponding to the symbol is examined to see if it has the largest node number in its block. If the external node does not have the largest node number, it is exchanged with the node that has the largest node number in the block, as long as the node with the higher number is not the parent of the node being updated. The weight of the external node is then incremented. If we did not exchange the nodes before the weight of the node is incremented, it is very likely that the ordering required by the sibling property would be destroyed. Once we have incremented the weight of the node, we have adapted the Huffman tree at that level. We then turn our attention to the next level by examining the parent node of the node whose weight was incremented to see if it has the largest number in its block. If it does not, it is exchanged with the node with the largest number in the block. Again, an exception to this is when the node with the higher node number is the parent of the node under consideration. Once an exchange has taken place (or it has been determined that there is no need for an exchange), the weight of the parent node is incremented. We then proceed to a new parent node and the process is repeated. This process continues until the root of the tree is reached. If the symbol to be encoded or decoded has occurred for the first time, a new external node is assigned to the symbol and a new NYT node is appended to the tree. Both the new external node and the new NYT node are offspring of the old NYT node. We increment the weight of the new external node by one. As the old NYT node is the parent of the new external node, we increment its weight by one and then go on to update all the other nodes until we reach the root of the tree.

3. Experimental Results

The proposed method works well for both grayscale images and two tone images of different formats. It is found that more compression is achieved while applying the proposed method on two tone images as it contains more amount of similarity when compared to grayscale images.

The result obtained for the jpeg image “puppy.jpg” is shown below :

(7)

Meera Gangadharan

, IJRIT 16

Fig 3.1 (a) Input Image (b) Reconstructed Image

The original size of the image is 24900 (pixels). After compression we got compressed data of size 16287(values). So compression ratio obtained is 0.654. As this method is a lossless method there are restrictions in obtaining more compression.

The results obtained for the bitmap image ‘man.bmp’ is shown below :

Fig 3.2 (a) Input Image (b) Reconstructed Image

The original image size is 32064(pixels). After compression we got compressed data of size 3848(values).

So compression ratio obtained is 0.120. The proposed method has been implemented on a set of images of different image formats and the obtained results are shown in the Table 3.1 and Table 3.2.

Table 3.1: Compression Rates Obtained for Grayscale Images

Image File Original Size (Pixels)

Compressed Size (Values)

Compression Ratio Percentage of Compression Achieved

MyImage.jpg 17850 12403 0.6948 30.5 %

cameraman.jpg 36864 29717 0.8061 19.3 %

puppy.jpg 24900 16287 0.6540 34.6 %

coin.jpg 28600 9716 0.3397 67 %

babyelephant.jpg 17680 15369 0.8692 13 %

(8)

Meera Gangadharan

, IJRIT 17

Table 3.2: Compression Rates Obtained for Two Tone Images

Image File Original Size (Pixels)

Compressed Size (Values)

Compression Ratio Percentage of Compression Achieved

letter.bmp 32450 2624 0.0861 91.3 %

circles.jpg 50264 7906 0.1572 84.28 %

man.bmp 30450 3848 0.120 88 %

banana.jpg 46812 8309 0.1774 82.26 %

pattern.giff 25600 3505 0.1369 86.4 %

4. Conclusion & Future Work

The major objective of image compression is to reduce or eliminate the data redundancies which may occur when storing an image so that the compressed image size can be minimal. Lossless compression involves with compressing data, which when decompressed, will be an exact replica of the original data.

The proposed idea puts down a method for image compression using LZW and adaptive Huffman coding which tends to provide higher compression ratio. As both the compression techniques that are being used are lossless compression techniques; hence the proposed method can also be called lossless technique (considering no loss of image data in bits). Hence a new and better method for compression is proposed.

The proposed project is implemented only on two tone images and grayscale image. As a future work this method can be extended to color images also.

References

[1] Ming-Bo Lin, Member, IEEE, Jang-Feng Lee, and Gene Eu Jan, "A Lossless Data Compression and Decompression Algorithm and Its Hardware Architecture", IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 9,925-936, SEPTEMBER 2006.

[2] Sukho Lee, Sang-Wook Park, Paul Oh, and Moon Gi Kang, Member, IEEE, "Colorization Based Compression Using Optimization" IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 22, NO. 7, 2627-2636, JULY 2013.

[3] Li Cheng, S.V. N. Vishwanathan, "Learning to Compress Images and Videos", Statistical Machine Learning, National ICT Australia, Research School of Information Sciences and Engineering, Australian National University, Canberra ACT 0200.

[4] Edward J. Delp,Student Member, IEEE, and O. Robert Mitchell, Member, IEEE, "Image Comprssion Using Block Truncation Coding", IEEE TRANSACTIONS ON COMMUNICATIONS, VOL.COM-27, 1335-1342, SEPTEMBER 1979.

[5] Jagadish H. Oujar,Lohit M . Kadlaskar , “ A New Lossless Method of Image Compression and Decompression using Huffman Coding Techniques”, Journal of Theoretical and Applied Information Technology © 2005 - 2010 JATIT.

[6] Prof. Rajendra Kumar Patel, "An Efficient Image Compression Technique Based on Arithmetic Coding", International Journal of Advanced Computer Research, Volume-2 Number-4 Issue-6 December- 2012.

[7] Varun Biala1, Dalveer Kaur, "Lossless Compression Technique Using ILZW with DFRLC", IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 3, No 3, 425-427, May 2012.

(9)

Meera Gangadharan

, IJRIT 18

[9] Compression Using Fractional Fourier Transform", A Thesis Submitted in the partial fulfillment of requirement for the award of the degree of Master of Engineering In Electronics and Communication. By Parvinder Kaur.

[10] Amit Makkar, Gurjit Singh, Rajneesh Narula, Dept. of CSE, Adesh Institute of Engineering, and Technology, Faridkot, Punjab, India, "Improving LZW Compression" International Journal of Computer Science And Technology Vol. 3, Issue 1, 411-414, Jan. - March 2012.

[11] Sonal, Dinesh Kumar, "A STUDY OF VARIOUS IMAGE COMPRESSION TECHNIQUES", Department of Computer Science and Engineering Guru Jhambheswar University of Science and Technology, Hisar.

[12] C. Tharini and P. Vanaja Ranjan,Department of Electrical and Electronics Engineering, College of Engineering, Guindy, Anna University, Chennai, India, "Design of Modified Adaptive Huffman Data Compression Algorithm for Wireless Sensor Network" Journal of Computer Science 5 (6): 466-470, 2009 ISSN 1549-3636 © 2009 Science Publications.

[13] I Made Agus Dwi Suarjaya Information Technology Department Udayana University Bali, Indonesia,

"A New Algorithm for Data Compression Optimization" International Journal of Advanced Computer Science and Applications,

Vol. 3, No.8, 2012

[14] C. Saravanan and R. Ponalagusamy, "Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding ", International Journal of Image Processing (IJIP), Volume (3) : Issue (5.).

[15] Dr. Monisha Sharma (Professor), Mr. Chandrashekhar K. (Associate Professor), Lalak Chauhan(M.E.

student), "Image Compression Using Huffman Coding Based On Histogram Information And Image Segmentation", International Journal of Engineering Research and Technology (IJERT) ISSN: 2278-0181 Vol. 1 Issue 8, October - 2012.

[16] Anat Levin, Dani Lischinski, Yair Weiss, "Colorization Using Optimization", School of Computer Science and Engineering, The Hebrew University of Jerusalem.

[17] Dorin Comaniciu, Member, IEEE, and Peter Meer, Senior Member, IEEE, "Mean Shift: A Robust Approach Toward Feature Space Analysis" IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 24, NO. 5, 603-619, MAY 2002.

[18] Mridul Kumar Mathur, Seema Loonker, Dr. Dheeraj Saxena, "LOSSLESS HUFFMAN CODING TECHNIQUE FOR IMAGE COMPRESSION AND RECONSTRUCTION USING BINARY TREES", Int.J.Comp.Tech.Appl,Vol 3 (1), 76-79.

[19] Robert G. Gallager, ”Variations on a theme by Huffman”, IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-24, NO. 6, NOVEMBER 1978.

[20] Jeffrey Scott Vitter, Brown University, Providence, Rhode Island, ”Design and Analysis of Dynamic Huffman Codes”, Journal of the association for computing machinery, Vol. 34, No. 4, October 1987, pp.825-845.

(10)

Meera Gangadharan

, IJRIT 19

[21] Joaquín Franco, Gregorio Bernabé, Juan Fernández and Manuel E. Acacio,"A Parallel Implementation of the 2D Wavelet Transform Using CUDA", Parallel Distributed and Network-based Processing.

[22] In Kyu Park, Nitin Singhal, Man Hee Lee, Sungdae Cho, and Chris W. Kim, "Design and Performance Evaluation of Image Processing Algorithms on GPUs", IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 22, NO. 1, JANUARY 2011.

[23] Ross Arnold and Tim Bell, "A corpus for the evaluation of lossless compression algorithms", Department of Computer Science, University of Canterbury, Christchurch, NZ.

[24] Welch, Terry (1984). "A Technique for High-Performance Data Compression", IEEE.

[25] Fang Wei, Qiu Cui, Ye Li, ”Fine-Granular Parallel EBCOT and Optimization with CUDA for Digital Cinema Image Compression”, 2012 IEEE International Conference on Multimedia and Expo.

[26]Pi-Chung Wang, Yuan-Rung Yang, Chun-Liang Lee, Hung-Yi Chang, ”A Memory-efficient Huffman Decoding Algorithm”, 2005 IEEE Proceedings of the 19th International Conference on Advanced Information Networking and Applications (AINA05).

[27] Shyni K, Manoj Kumar KV, "Lossless LZW Data Compression Algorithm on CUDA", IOSR Journal of Computer Engineering (IOSR-JCE), Volume 13, Issue 1 (Jul. - Aug. 2013), PP 122-127.

[28] L. Erodi Smolenice, Slovakia, "File compression with LZO algorithm using NVIDIA CUDA architecture", LINDI 2012 4th IEEE International Symposium on Logistics and Industrial Informatics September 5-7, 2012; Smolenice, Slovakia.

[28] Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing", Pearson Education, Third edition, 2009.

[29] Ida Mengyi Pu, "Fundamental Data Compression".

[30] S Jayaraman, S Esakirajan and T Veerakumar., "Digital Image Processing", 1/e. Mc Graw Hill education,2009.

[31] Khalid Sayood, "Introduction to Data Compression", Third edition.