The information rate of a code

Coding Theory


Both codes have the remarkable property that NCWD and SNCWD coincide when these codes are in use. C23 has the further property that there is never a “do not decode” result. This is not the case with C24, however, and, in fact, every error pattern of weight 4 occurring when C24 is in use will result in a “do not decode” message.

(a) Express R(C23, p) and R(C24, p) explicitly as functions of p, assuming p > 1/2.

(b) Show that R(C23, p) > R(C24, p) for 1/2 < p < 1, but that R0(C23, p) = R(C23, p) < R0(C24, p) for 1/2 < p < 1.

(c) |C23| = |C24|, both codes come equipped with very fast NCWD algorithms, and the slightly greater length of C24 is a negligible drawback; so, what do the results of part (b), above, suggest to you about which of the two codes you would choose to use? In some circumstances you would take C23 over C24, and in other circumstances it would be the other way around. What sort of consideration decides the choice? Be brief.

4. $C \subseteq \{0,1\}^\ell$ and the channel is binary and symmetric with reliability $p$.

Show that $t = \lfloor (d(C) - 1)/2 \rfloor$ is the largest integer among the integers $i$ with the property that $C$ will correct (using NCWD) all error patterns of weight $\le i$.
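As an illustration only (not a proof of the exercise), the quantity $t$ can be computed for small codes by brute force. The sketch below assumes code words are given as equal-length strings of 0s and 1s; the helper names are hypothetical.

```python
from itertools import combinations

def hamming_distance(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))

def minimum_distance(code):
    """d(C): the smallest Hamming distance between two distinct code words."""
    return min(hamming_distance(u, v) for u, v in combinations(code, 2))

def correctable_weight(code):
    """t = floor((d(C) - 1) / 2)."""
    return (minimum_distance(code) - 1) // 2

repetition = ["00000", "11111"]               # d(C) = 5
pairs = ["0000", "0011", "1100", "1111"]      # d(C) = 2

print(minimum_distance(repetition), correctable_weight(repetition))
print(minimum_distance(pairs), correctable_weight(pairs))
```

For the length-5 repetition code this gives $t = 2$; for the second code $d(C) = 2$, so $t = 0$ and not even all single errors are corrected.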

4.8 The information rate of a code

For a binary code $C \subset \{0,1\}^\ell$, the information rate of $C$ is generally defined to be $(\log_2 |C|)/\ell$. To see what this really means, and how to generalize it to variable-length encoding schemes over possibly non-binary code alphabets, suppose that $C$ is used to encode a source alphabet $S$ with $m = |C|$ letters of equal relative frequencies. Then $H(S) = \log_2 m = \log_2 |C|$; this is the number of bits of information carried by each code word. Since each code word is $\ell$ bits long, the number of bits per bit, so to speak, carried by the code words is $(\log_2 |C|)/\ell$. To put it another way, $(\log_2 |C|)/\ell$ is the rate at which the code words are carrying information, in bits per input (code) letter.
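A minimal sketch of this definition, assuming the code is supplied as a list of equal-length binary strings (the function name is hypothetical):

```python
import math

def information_rate(code):
    """(log2 |C|) / l for a binary block code given as equal-length strings."""
    lengths = {len(w) for w in code}
    if len(lengths) != 1:
        raise ValueError("a block code needs words of one common length")
    length = lengths.pop()
    return math.log2(len(code)) / length

# Four code words of length 4 carry 2 bits each, so the rate is 2/4.
print(information_rate(["0000", "0011", "1100", "1111"]))

# For a large code only |C| and the length matter; a code with 2**12
# words of length 23 has rate 12/23.
print(12 / 23)
```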

The preceding account rests on the assumption that the relative source frequencies are equal. In the more general situation when we have a source alphabet $S$, $|S| = m$, with possibly unequal relative frequencies, and a uniquely decodable scheme for $S \to A$, where $A$ is a code alphabet with $|A| = n \ge 2$, the discussion above is adaptable to give the result that the information rate of the code, interpreted as the average amount of (source) information carried by the code words, per code letter, is $H(S)/\bar{\ell}$, where $\bar{\ell}$ is the average code word length, computed from the scheme and the relative source frequencies. The units of information are determined by the choice of the base of the log appearing in the computation of $H(S)$. To accord with the binary case, we may as well adopt the convention that the base is $n = |A|$.
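The general formula can be sketched as follows. The function names and the sample scheme are assumptions made for illustration; the log base is taken to be $n = |A|$, following the convention just adopted.

```python
import math

def entropy(freqs, base):
    """H(S), in units fixed by the base of the logarithm."""
    return -sum(f * math.log(f, base) for f in freqs if f > 0)

def pretransmission_rate(freqs, codewords, alphabet_size):
    """H(S) / (average code word length), with log base n = |A|."""
    avg_len = sum(f * len(w) for f, w in zip(freqs, codewords))
    return entropy(freqs, alphabet_size) / avg_len

# A variable-length binary prefix scheme with unequal source frequencies.
freqs = [0.5, 0.25, 0.125, 0.125]
scheme = ["0", "10", "110", "111"]
print(pretransmission_rate(freqs, scheme, 2))

# With equal frequencies and a fixed length, the value reduces to
# (log2 |C|) / l, here 2/4.
print(pretransmission_rate([0.25] * 4, ["0000", "0011", "1100", "1111"], 2))
```

In the first example $H(S)$ and $\bar{\ell}$ are both 1.75, so the rate is 1 up to floating-point rounding.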

If $A$ is the input alphabet of a channel, at what rate, then, is information originating from the source appearing at the receiver of the channel, per output letter, given a uniquely decodable scheme for $S \to A$? Since there is one output letter per input letter, and $I(A, B)$ gives the rate of flow of information through the channel, i.e., the average number of units of information arriving at the receiver for each unit of information transmitted, it follows that the average amount of information from the source to the receiver, per output letter, is $(H(S)/\bar{\ell})\,(I(A, B)|_{p_1,\dots,p_n})$, with $\log = \log_n$ and $p_1, \dots, p_n$ computed from the encoding scheme and the relative source frequencies, as in Section 4.4.

Both $H(S)/\bar{\ell}$ and this quantity multiplied by $I(A, B)|_{p_1,\dots,p_n}$ are indicators of an encoding scheme’s efficacy, but let us be under no illusions as to the sensitivity of these indices. Notice that $H(S)/\bar{\ell}$ is increased only by decreasing $\bar{\ell}$, within the requirement of unique decodability; clearly different uniquely decodable encoding schemes with the same $\bar{\ell}$ can have very different qualities. In particular, in the case of fixed-length encoding schemes, this index leaves error-correction facility and encoding/decoding speed and efficiency completely out of account. The index $(H(S)/\bar{\ell})\,(I(A, B)|_{p_1,\dots,p_n})$ is somewhat more interesting: it goes up as $\bar{\ell}$ decreases and/or as the $p_1, \dots, p_n$ resulting from the scheme better approximate the optimal input frequencies of the channel. But it still leaves out of account code qualities of practical interest. (See Exercise 4.8.3.)

This does not mean that these indices are useless! Consider that knowing the area of a planar figure tells you nothing about the shape or other geometric and topological properties of the figure. Does that mean that we should give up on the parameter we call “area”? Just so, the two parameters we are discussing here will have their uses in the discussion and comparison of code-and-channel systems. We need to be aware of the limitations of these discussions and comparisons, but if we are aware, then let’s proceed! From here on, we will refer to $H(S)/\bar{\ell}$ as the (pretransmission) information rate of the code involved, and $(H(S)/\bar{\ell})\,(I(A, B)|_{p_1,\dots,p_n})$ as the (post-transmission) information rate of the code-and-channel system.
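The post-transmission information rate can be sketched numerically as below. Two things here are assumptions rather than the text's own definitions: the helper names, and the reading of Section 4.4 in which the input frequency of a code letter is its expected number of occurrences per source letter divided by $\bar{\ell}$. The channel is given by a matrix `Q` whose entry `Q[i][j]` is the probability that output letter $j$ is received when input letter $i$ is sent.

```python
import math

def entropy(freqs, base):
    """H(S), in units fixed by the base of the logarithm."""
    return -sum(f * math.log(f, base) for f in freqs if f > 0)

def average_length(freqs, codewords):
    return sum(f * len(w) for f, w in zip(freqs, codewords))

def input_frequencies(freqs, codewords, alphabet):
    """Assumed reading of Section 4.4: expected count of each code letter
    per source letter, divided by the average code word length."""
    avg_len = average_length(freqs, codewords)
    return [sum(f * w.count(a) for f, w in zip(freqs, codewords)) / avg_len
            for a in alphabet]

def mutual_information(p, Q, base):
    """I(A, B) = sum_i sum_j p_i q_ij log(q_ij / P(b_j))."""
    outputs = [sum(p[i] * Q[i][j] for i in range(len(p)))
               for j in range(len(Q[0]))]
    total = 0.0
    for i, p_i in enumerate(p):
        for j, q_ij in enumerate(Q[i]):
            if p_i * q_ij > 0:
                total += p_i * q_ij * math.log(q_ij / outputs[j], base)
    return total

def post_transmission_rate(freqs, codewords, alphabet, Q):
    n = len(alphabet)
    p = input_frequencies(freqs, codewords, alphabet)
    pre = entropy(freqs, n) / average_length(freqs, codewords)
    return pre * mutual_information(p, Q, n)

# Binary symmetric channel with reliability 0.9 (an arbitrary sample value).
reliability = 0.9
Q = [[reliability, 1 - reliability],
     [1 - reliability, reliability]]
code = ["0000", "0011", "1100", "1111"]
rate = post_transmission_rate([0.25] * 4, code, "01", Q)
print(rate)
```

Here the input frequencies come out as 1/2 and 1/2, so the result is the pretransmission rate 1/2 times the capacity of the binary symmetric channel.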

Given $S$, $f_1, \dots, f_m$, $A$, $B$, $Q$, and a fixed-length uniquely decodable scheme for $S \to A$, $s_j \to w_j \in A^\ell$, $j = 1, \dots, m$, as in Section 4.5, there is another parameter associable to the code-and-channel system that might be preferable to $(H(S)/\ell)\,I(A, B)$ as a measure of information flow from the source to the channel receiver, per output letter: $I(S, B)/\ell$. Here $S$ and $B$ stand for the obvious systems of events associable to the multistage probabilistic experiment described in Section 4.5. The mutual information $I(S, B)$ is divided by $\ell$, above, to make the result comparable to $(H(S)/\ell)\,(I(A, B)|_{p_1,\dots,p_n})$ as a measure of information conveyed per output letter.

Shannon’s interpretation of $I(A, B)$ as measuring the average amount of information conveyed by the channel (given certain relative input frequencies) per input letter (see Section 3.4) transfers to an interpretation of $I(S, B)$, in this more complicated situation, as the average amount of information conveyed by the code-and-channel system, per source letter, given $f_1, \dots, f_m$ and a fixed-length encoding scheme for $S \to A$. Thus $I(S, B)/\ell$ would seem to be a measure of rate of information flow from the source to the channel receiver, per output letter, that is more reflective of error correction concerns and a generally more sensitive index of the efficacy of the code-and-channel system than is $(H(S)/\ell)\,I(A, B)$.

However, $I(S, B)/\ell$ as an index of goodness suffers from a grave defect: it is frightfully difficult to calculate, even in the simplified circumstances of binary block codes with equal source frequencies and a binary symmetric channel. Suppose we are in these circumstances and, in addition, the relative input frequencies generated by the use of the code are $p_0 = p_1 = 1/2$. (This is a common circumstance in practice. For instance, 0 and 1 occur equally often when the Golay codes, mentioned in Exercise 4.7.3, are used with equal source frequencies. The same holds for any linear block code, as defined in [30], containing the word with all ones, and almost all of the commonly used binary block codes are of this sort.) Then $(H(S)/\ell)\,I(A, B) = \frac{\log_2 |C|}{\ell}\,\bigl(p \log_2(2p) + (1 - p)\log_2(2(1 - p))\bigr)$, where $C \subseteq \{0,1\}^\ell$ is the code ($|C| = m = |S|$) and $p$ is the reliability of the channel. That is, the post-transmission information rate is just the conventional information rate times the channel capacity. Meanwhile, $I(S, B)$ is a daunting sum of $m \cdot 2^\ell$ terms; see Exercise 4.8.2. For particular binary block codes this expression can be greatly simplified, but not enough to put it in the category of $(\log_2 |C|)/\ell$, $d(C)$, or $\ell$ itself as easily consulted indices of the quality of a binary block code, in standard circumstances. One could argue that the difficulty of calculating $I(S, B)$ is the price you pay for the subtle power of this index. But an indicator that is harder to calculate than the items of interest that it might be an indicator of, like reliability and error probability, is not a useful indicator.
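To make the size of that sum concrete, here is a brute-force sketch for very small codes. It assumes equal source frequencies, a binary symmetric channel with reliability $p$, and that $B$ is the system of received words of length $\ell$, so that the probability of receiving $v$ when $w$ is sent is $p^{\ell - d}(1 - p)^{d}$ with $d$ the Hamming distance between $w$ and $v$. The function names are hypothetical.

```python
import math
from itertools import product

def bsc_capacity(p):
    """p*log2(2p) + (1-p)*log2(2(1-p)), the capacity of the channel."""
    return sum(x * math.log2(2 * x) for x in (p, 1 - p) if x > 0)

def source_receiver_information(code, p):
    """I(S, B) in bits, as a sum over all m * 2**l (code word, received word) pairs."""
    m, length = len(code), len(code[0])
    f = 1 / m                                  # equal source frequencies
    total = 0.0
    for v in product("01", repeat=length):     # every possible received word
        cond = []
        for w in code:
            d = sum(a != b for a, b in zip(w, v))
            cond.append(p ** (length - d) * (1 - p) ** d)   # P(v | w)
        p_v = f * sum(cond)                    # P(v)
        for c in cond:
            if c > 0:
                total += f * c * math.log2(c / p_v)
    return total

code = ["000", "111"]      # length-3 repetition code
p = 0.9                    # arbitrary sample reliability
length = len(code[0])

print(source_receiver_information(code, p) / length)          # I(S, B) / l
print(math.log2(len(code)) / length * bsc_capacity(p))        # (H(S)/l) I(A, B)
```

The loop visits $2^\ell$ received words and $m$ code words for each, which is why this approach is only feasible for toy examples.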

Still, $I(S, B)$ is an important and interesting number associated with a fixed-length code-and-channel system, and an academic study of its behavior and its relation to other indicators may bring some rewards. Here is a question for anyone who might be interested: by Corollary 2.4.6, $I(S, B) \le H(S)$; is it necessarily the case that $I(S, B) \le H(S)\,I(A, B)$? $I(A, B)$ here is, of course, calculated with the relative input frequencies produced by the encoding scheme and the relative source frequencies.

Exercises 4.8

1. Calculate $H(S)/\ell$ and $(H(S)/\ell)\,I(A, B)$ for each of the fixed-length code-and-channel systems in Exercise 4.5.1.

Write this out as a function of $p$, using $\log = \log_2$, in case $m = 4$, $f_j = 1/4$, $j = 1, \dots, m$, $\{w_1, w_2, w_3, w_4\} = \{0000, 0011, 1100, 1111\}$, and the channel is binary symmetric with reliability $p$. Compare $I(S, B)$ to $H(S)$ and to $H(S)\,I(A, B)$ in this case.

3. Describe a binary block code $C$ of length 23 with the same information rate and post-transmission information rate as the Golay code, $C_{23}$, mentioned in Exercise 4.7.3, with $d(C) = 1$. [You need to know that $|C_{23}| = 2^{12}$ and that 0 and 1 occur equally often, overall, in the code words of $C_{23}$. Assume equal source frequencies.]


In document Introduction to Information Theory and Data Compression (Page 123-127)