10. Image Compression
10.3 Huffman Coding
In 1952, David Huffman published the paper that introduced Huffman coding. This technique was the state of the art until about 1977. The beauty of Huffman codes is that variable-length codes can achieve a higher data density than fixed-length codes if the characters differ in frequency of occurrence. The length of a character's code is inversely related to that character's frequency: the more frequent the character, the shorter its code. Huffman wasn't the first to discover this, but his paper presented the optimal algorithm for assigning these codes.
Huffman codes are similar to Morse code. Morse code uses the fewest dots and dashes for the most frequently occurring letters. An E is represented with one dot. A T is represented with one dash. Q, a letter occurring less frequently, is represented with dash-dash-dot-dash.
Huffman codes are created by analyzing the data set and assigning the shortest bit strings to the data values that occur most frequently. The algorithm attempts to create codes that minimize the average number of bits per character. Table 10.1 shows an example of the frequency of letters in some text and their corresponding Huffman codes. To keep the table manageable, only letters were used, even though it is well known that the space character is the most frequently occurring character in English text.
As expected, E and T had the highest frequency and the shortest Huffman codes. Encoding with these codes is simple. Encoding the word toupee would be just a matter of stringing together the appropriate bit strings, as follows:
T    O     U      P      E    E
111  0100  10111  10110  100  100
One ASCII character requires 8 bits. The original 48 bits of data have been coded with 23 bits, achieving a compression ratio of about 2.09 (48/23).
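The toupee example can be checked with a few lines of Python. This is only a minimal sketch: the dictionary holds just the five codes from Table 10.1 that the word needs, the bits are kept as a text string for readability, and the names CODES and encode are illustrative rather than anything defined in the text.

```python
# Codes for the letters of "toupee", copied from Table 10.1.
CODES = {"T": "111", "O": "0100", "U": "10111", "P": "10110", "E": "100"}


def encode(word, codes):
    """Concatenate the code of each letter (bits kept as a text string)."""
    return "".join(codes[letter] for letter in word.upper())


bits = encode("toupee", CODES)
original_bits = 8 * len("toupee")   # 8-bit ASCII, as assumed in the text
ratio = original_bits / len(bits)   # compression ratio
print(bits, len(bits), round(ratio, 2))
```

A real encoder would pack these bits into bytes instead of keeping them as characters.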
Letter  Frequency  Code
A        8.23      0000
B        1.26      110000
C        4.04      1101
D        3.40      01011
E       12.32      100
F        2.28      11001
G        2.77      10101
H        3.94      00100
I        8.08      0001
J        0.14      110001001
K        0.43      1100011
L        3.79      00101
M        3.06      10100
N        6.81      0110
O        7.59      0100
P        2.58      10110
Q        0.14      1100010000
R        6.67      0111
S        7.64      0011
T        8.37      111
U        2.43      10111
V        0.97      0101001
W        1.07      0101000
X        0.29      11000101
Y        1.46      010101
Z        0.09      1100010001
Table 10.1 Huffman codes for the alphabet letters.
Introduction to Image Processing and Computer Vision by LUONG CHI MAI http://www.netnam.vn/unescocourse/computervision/computer.htm
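To see how much the variable-length codes save on average, the weighted mean code length of Table 10.1 can be compared with a fixed-length code. The sketch below assumes the frequencies are percentages and that a fixed-length code for 26 letters would use 5 bits per letter; the names TABLE, avg_bits and FIXED_BITS are illustrative.

```python
import math

# (frequency in percent, code) for each letter, copied from Table 10.1.
TABLE = {
    "A": (8.23, "0000"), "B": (1.26, "110000"), "C": (4.04, "1101"),
    "D": (3.40, "01011"), "E": (12.32, "100"), "F": (2.28, "11001"),
    "G": (2.77, "10101"), "H": (3.94, "00100"), "I": (8.08, "0001"),
    "J": (0.14, "110001001"), "K": (0.43, "1100011"), "L": (3.79, "00101"),
    "M": (3.06, "10100"), "N": (6.81, "0110"), "O": (7.59, "0100"),
    "P": (2.58, "10110"), "Q": (0.14, "1100010000"), "R": (6.67, "0111"),
    "S": (7.64, "0011"), "T": (8.37, "111"), "U": (2.43, "10111"),
    "V": (0.97, "0101001"), "W": (1.07, "0101000"), "X": (0.29, "11000101"),
    "Y": (1.46, "010101"), "Z": (0.09, "1100010001"),
}

total = sum(freq for freq, _ in TABLE.values())
# Average code length, weighted by how often each letter occurs.
avg_bits = sum(freq * len(code) for freq, code in TABLE.values()) / total
# Bits per letter for a fixed-length code covering all 26 letters.
FIXED_BITS = math.ceil(math.log2(len(TABLE)))
print(f"Huffman: {avg_bits:.2f} bits per letter, fixed: {FIXED_BITS} bits")
```

The average comes out below the fixed-length figure because the long codes belong to letters such as Q and Z that rarely occur.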
During the code creation process, a binary tree representing these codes is created. Figure 10.3 shows the binary tree representing Table 10.1. It is easy to get codes from the tree. Start at the root and trace the branches down to the letter of interest. Every branch that goes to the right represents a 1. Every branch to the left is a 0. If we want the code for the letter R, we start at the root and go left-right-right-right, yielding a code of 0111.
Using a binary tree to represent Huffman codes ensures that our codes have the prefix property. This means that one code cannot be the prefix of another code. (Maybe it should be called the non-prefix property.) If we represent the letter e as 01, we could not encode another letter as 010. Say we also tried to represent b as 010. As the decoder scanned the input bit stream 010..., as soon as it saw 01, it would output an e and start the next code with 0. As you can expect, everything beyond that output would be garbage. Anyone who has debugged software dealing with variable-length codes can verify that one incorrect bit can invalidate all subsequent data. Any variable-length code that is decoded one bit at a time, without lookahead, must have the prefix property.
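The prefix property and the bit-by-bit decoder described above can be sketched as follows. This is an illustration under simple assumptions: bits are held in a text string, GOOD is a five-letter subset of Table 10.1, BAD is the failing e/b example from the text, and the function names are hypothetical.

```python
def is_prefix_free(codes):
    """Return True if no code is a prefix of another symbol's code."""
    words = list(codes.values())
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(words)
        for j, b in enumerate(words)
    )


def decode(bits, codes):
    """Decode a bit string, emitting a symbol as soon as a code matches."""
    lookup = {code: symbol for symbol, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:       # unambiguous only for prefix-free codes
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit stream ended in the middle of a code")
    return "".join(out)


GOOD = {"T": "111", "O": "0100", "U": "10111", "P": "10110", "E": "100"}
BAD = {"e": "01", "b": "010"}       # the failing example from the text

print(is_prefix_free(GOOD), is_prefix_free(BAD))
print(decode("11101001011110110100100", GOOD))
```

With BAD, the decoder outputs an e as soon as it sees 01 and is left holding a stray 0, which is exactly the failure described above.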
Figure 10.3 Binary tree of alphabet.
The first step in creating Huffman codes is to create an array of character frequencies. This is as simple as parsing your data and incrementing each corresponding array element for each character encountered. The binary tree can easily be constructed by recursively grouping the lowest frequency characters and nodes. The algorithm is as follows:
1. All characters are initially considered free nodes.
2. The two free nodes with the lowest frequency are assigned to a parent node with a weight equal to the sum of the two free child nodes.
3. The two child nodes are removed from the free nodes list. The newly created parent node is added to the list.
4. Steps 2 through 3 are repeated until there is only one free node left. This free node is the root of the tree.
When creating your binary tree, you may run into two different characters with the same frequency. It really doesn't matter what you use for your tie-breaking scheme, but you must be consistent between the encoder and decoder.
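The four steps above can be sketched in Python with a priority queue holding the free nodes. This is one possible implementation, not the author's: ties are broken by the order in which nodes were created, the first node taken becomes the left (0) child, and symbols are assumed to be single characters. The name huffman_codes is illustrative.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(data):
    """Build a Huffman code for the characters in data (steps 1 to 4)."""
    freqs = Counter(data)               # array of character frequencies
    order = count()                     # tie-breaker: node creation order
    # Step 1: every character starts as a free node (weight, tie, tree).
    free = [(w, next(order), sym) for sym, w in sorted(freqs.items())]
    heapq.heapify(free)
    while len(free) > 1:
        # Step 2: take the two free nodes with the lowest weight.
        w1, _, left = heapq.heappop(free)
        w2, _, right = heapq.heappop(free)
        # Step 3: replace them with a parent whose weight is their sum.
        heapq.heappush(free, (w1 + w2, next(order), (left, right)))
    # Step 4: the single remaining free node is the root of the tree.
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):     # internal node with two children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                           # leaf: a character
            codes[node] = prefix or "0"

    walk(free[0][2], "")
    return codes


codes = huffman_codes("aaaaabbcd")
print(codes)
```

Because the tie-breaker is deterministic, an encoder and a decoder that both run this function on the same frequencies arrive at the same tree.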
Let's create a binary tree for the image below. The 8 x 8 pixel image is small to keep the example simple. In the section on JPEG encoding, you will see that images are broken into 8 x 8 blocks for encoding. The letters represent the colors Red, Green, Blue, Cyan, Magenta, Yellow, and Black (Figure 10.4).
Figure 10.4 Sample 8 x 8 screen of red, green, blue, cyan, magenta, yellow, and black pixels.
Before building the binary tree, the frequency table (Table 10.2) must be generated.
Figure 10.5 shows the free nodes table as the tree is built. In step 1, all values are marked as free nodes. The two lowest frequencies, magenta and yellow, are combined in step 2. Cyan is then added to the current sub-tree in step 3; blue and green are added in steps 4 and 5. In step 6, rather than adding a new color to the sub-tree, a new parent node is created. This is because the addition of the black and red weights (36) produced a smaller number than adding black to the sub-tree (45). In step 7, the final tree is created. To stay consistent between the encoder and decoder, I order the nodes by decreasing weight. You will notice in step 1 that yellow (weight of 1) is to the right of magenta (weight of 2). This protocol is maintained throughout the tree-building process (Figure 10.5). The resulting Huffman codes are shown in Table 10.3.
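The walk-through above can be replayed in code. The sketch below keeps the free nodes in a list sorted by decreasing weight and always places the heavier of the two merged nodes on the left (0) and the lighter on the right (1). That convention is inferred from the ordering rule described in the text and from Table 10.3, so treat it as an assumption; the variable names are illustrative.

```python
FREQS = {"red": 19, "black": 17, "green": 16, "blue": 5,
         "cyan": 4, "magenta": 2, "yellow": 1}        # Table 10.2

# Free-node list in decreasing weight order: (weight, subtree).
free = sorted(((w, name) for name, w in FREQS.items()), reverse=True)
merged_weights = []
while len(free) > 1:
    # The two lowest weights sit at the end of the list.
    (w_light, light), (w_heavy, heavy) = free.pop(), free.pop()
    merged_weights.append(w_light + w_heavy)
    # Heavier child on the left (0), lighter child on the right (1).
    free.append((w_light + w_heavy, (heavy, light)))
    free.sort(key=lambda node: node[0], reverse=True)


def assign(node, prefix, codes):
    """Walk the tree, appending 0 for left branches and 1 for right."""
    if isinstance(node, tuple):
        assign(node[0], prefix + "0", codes)
        assign(node[1], prefix + "1", codes)
    else:
        codes[node] = prefix
    return codes


codes = assign(free[0][1], "", {})
total_bits = sum(FREQS[color] * len(codes[color]) for color in FREQS)
print(merged_weights)
print(codes, total_bits)
```

The list of merged weights shows the 36 from step 6 appearing after the 28 from step 5, matching the order described above.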
When using variable-length codes, there are a couple of important things to keep in mind. First, they are more difficult to manipulate with software. You are no longer working with ints and longs. You are working at the bit level and need your own bit manipulation routines. Computer instructions are designed to work with byte and multiple-byte objects, so objects of variable bit lengths introduce a little more complexity when writing and debugging software. Second, as previously described, you are no longer working on byte boundaries. One corrupted bit can wipe out the rest of your data, because there is no way to know where the next codeword begins. With fixed-length codes, you know exactly where the next codeword begins.
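As an illustration of the bit manipulation routines mentioned above, here is a minimal sketch that packs a string of code bits into whole bytes and unpacks it again. The function names are hypothetical, and returning the bit count alongside the bytes is just one assumed way to let the reader discard the padding.

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, zero-padded at the end."""
    padding = (-len(bits)) % 8
    padded = bits + "0" * padding
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, len(bits)          # the bit count lets a reader drop the padding


def unpack_bits(data, bit_count):
    """Recover the original bit string from the packed bytes."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits[:bit_count]


packed, n = pack_bits("11101001011110110100100")    # "toupee", 23 bits
print(len(packed), unpack_bits(packed, n))
```

The 23 code bits do not end on a byte boundary, so one padding bit is added; without the stored bit count, a decoder could mistake padding for the start of another code.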
Color    Frequency
red      19
black    17
green    16
blue      5
cyan      4
magenta   2
yellow    1
Table 10.2 Frequency table for Figure 10.4.
Color    Code
red      00
black    01
green    10
blue     111
cyan     1100
magenta  11010
yellow   11011
Table 10.3 Huffman codes for Figure 10.5.
Figure 10.5 Binary tree creation.
One drawback to Huffman coding is that encoding requires two passes over the data. The first pass accumulates the character frequency data; the second pass uses the resulting codes to compress the data.
One way to remove a pass is to always use one fixed table. Of course, the table will not be optimized for every data set that will be compressed. The modified Huffman coding technique in the next section uses fixed tables.
The decoder must use the same binary tree as the encoder. One way to provide the tree to the decoder is to use a standard tree, which may not be optimal for the data being compressed.
Another option is to store the binary tree with the data. Alternatively, rather than storing the tree, the character frequencies could be stored and the decoder could regenerate the tree, which would increase decoding time. Either way, adding this information to the compressed data decreases the compression ratio.
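The cost of shipping the frequencies with the data can be estimated for the 8 x 8 example. The numbers below rest on two assumptions that are not in the text: the uncompressed image uses 8 bits per pixel, and the stored table uses a hypothetical layout of one byte for the color identifier plus one byte for its count.

```python
FREQS = {"red": 19, "black": 17, "green": 16, "blue": 5,
         "cyan": 4, "magenta": 2, "yellow": 1}         # Table 10.2
CODE_LENGTHS = {"red": 2, "black": 2, "green": 2, "blue": 3,
                "cyan": 4, "magenta": 5, "yellow": 5}  # lengths from Table 10.3

raw_bits = 8 * sum(FREQS.values())          # assumption: 8 bits per pixel
payload_bits = sum(FREQS[c] * CODE_LENGTHS[c] for c in FREQS)
header_bits = 16 * len(FREQS)               # assumption: 2 bytes per table entry

ratio_without = raw_bits / payload_bits
ratio_with = raw_bits / (payload_bits + header_bits)
print(round(ratio_without, 2), round(ratio_with, 2))
```

For a block this small the table is a large fraction of the output; on larger data sets the same table is spread over many more coded symbols.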
The next coding method overcomes the problem of losing data when one bit gets corrupted. It is used in fax machines, which communicate over noisy phone lines. It has a synchronization mechanism that limits data loss to one scanline.