Error correction with binary symmetric channels and equal source frequencies

Coding Theory

4.7 Error correction with binary symmetric channels and equal source frequencies

The case of equal source frequencies and a binary symmetric channel is a very important special case, because it is the case we think we are in, in a great number of real, practical situations in the world today. Or perhaps we just hope and assume that we are in this situation; see 4.5.7 and Section 3.1.

By Corollary 4.5.6, when the channel is binary and symmetric with reliability p > 1/2 and the source frequencies are equal, MLD and NCWD coincide. Thus, for each w ∈ {0,1}^ℓ received (ℓ being the common length of the code words), we decode by examining the words w1, ..., wm ∈ {0,1}^ℓ in the encoding scheme and picking the one, if any, closer to w in the Hamming distance sense than are any of the other wi. In this section we will see a way that this procedure might be simplified, at the cost of some reliability.

As remarked in 4.5.7, the situation described in the title of this section is the setting of most of coding theory, which is mainly about binary block codes.

We shall not go far into that theory, but during our excursion we shall observe its customs. For one thing, we shall refer to the set C = {w1, ..., wm} ⊆ {0,1}^ℓ of code words appearing in the encoding scheme as the code, and all mention of the source alphabet and of the encoding scheme will be suppressed. This is not unreasonable in the circumstances, since our decoding method is NCWD; the only thing we need to know about the source alphabet is its size, m ≤ 2^ℓ.

Definitions The operation + is defined on {0,1} by 0 + 0 = 0, 0 + 1 = 1 + 0 = 1, and 1 + 1 = 0. The operation + is then defined on {0,1}^ℓ coordinatewise, given the definition above. [For example, with ℓ = 5, 01101 + 11110 = 10011.]

The Hamming weight of a word w ∈ {0,1}^ℓ is wt(w) = number of ones appearing in w. [For example, wt(10110) = 3.]
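The two definitions above can be tried out directly. The following is a minimal Python sketch, assuming binary words are represented as strings of the characters 0 and 1; the function names add and wt are chosen here to mirror the notation and are not part of the text.

```python
def add(u: str, v: str) -> str:
    """Coordinatewise sum of two binary words, using 1 + 1 = 0."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    # Two bits sum to 1 exactly when they differ.
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


def wt(w: str) -> int:
    """Hamming weight: the number of ones appearing in w."""
    return w.count("1")


# The two bracketed examples from the definitions.
print(add("01101", "11110"))  # 10011
print(wt("10110"))            # 3
```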

If w1, ..., wm ∈ {0,1}^ℓ are distinct words, the distance of the code C = {w1, ..., wm} is

\[ d(C) = \min_{1 \le i < j \le m} d_H(w_i, w_j) = \min_{w, v \in C,\ w \ne v} d_H(w, v). \]

4.7.1 For u, v ∈ {0,1}^ℓ, note that u + v has ones precisely where u and v differ. Thus dH(u,v) = wt(u + v).
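Since dH(u,v) counts the places where u and v differ, both the Hamming distance and the distance of a code are easy to compute. The Python sketch below assumes words are strings of 0s and 1s; the names hamming_distance and code_distance are illustrative, and the sample code is the one that reappears in Example 4.7.3.

```python
from itertools import combinations


def hamming_distance(u: str, v: str) -> int:
    """d_H(u, v) = wt(u + v): the number of positions where u and v differ."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(u, v))


def code_distance(code) -> int:
    """d(C): the minimum of d_H(w, v) over all pairs of distinct code words."""
    words = sorted(set(code))
    if len(words) < 2:
        raise ValueError("a code needs at least two distinct words")
    return min(hamming_distance(w, v) for w, v in combinations(words, 2))


C = ["00000", "11100", "01111"]
print(code_distance(C))  # 3
```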

4.7.2 Verify that wt(u + v) ≤ wt(u) + wt(v), for all u, v ∈ {0,1}^ℓ.

Definition We will say that a code C ⊆ {0,1}^ℓ corrects the error pattern u ∈ {0,1}^ℓ if and only if, for each w ∈ C, NCWD will decode w + u as w (or as whatever source letter w represents).

The u appearing in the last definition above could be any binary word of length ℓ. When we call u an error pattern we are thinking that, during the transmission of a binary word of length ℓ through the channel, errors occurred at precisely those places in the word marked by 1’s in u. Thus, by the definition of +, if w was transmitted and the error pattern u occurred, the word received at the receiving end of the channel would be w + u. Thus the definition above says that C corrects u if and only if, whenever the error pattern u occurs and the code C is in use, NCWD (= MLD) will correctly decode the received word, whichever w ∈ C was sent.

4.7.3 Example Let C = {00000, 11100, 01111}. Verify that C corrects 00000, 10000, 01000, 00100, 00010, 00001, and no other error patterns. Note that if 11100 is transmitted and the error pattern 01010 occurs, then 10110 will be received, which is closer to 11100 than to either of the other two code words; but if 00000 or 01111 is transmitted and that error pattern occurs, NCWD will decide not to decode. Thus that error pattern is not corrected by the code.

4.7.4 Let C = {0⁶, 0³1³, 1³0³, 1⁶}. Verify that the set of error patterns corrected by C is {0⁶} ∪ {all 6 binary words of length 6, of Hamming weight 1} ∪ {100100, 100010, 100001, 010100, 010010, 010001, 001100, 001010, 001001}.
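The claims in 4.7.3 and 4.7.4 can be checked by brute force over all error patterns. The Python sketch below is one way to do so; it assumes words are strings of 0s and 1s and that NCWD declines to decode (returns None here) when the nearest code word is not unique. The names ncwd and corrected_patterns are hypothetical helpers, not notation from the text.

```python
from itertools import product


def add(u: str, v: str) -> str:
    """Coordinatewise sum of two binary words, using 1 + 1 = 0."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


def hamming_distance(u: str, v: str) -> int:
    """Number of positions where u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def ncwd(code, received):
    """Return the unique nearest code word, or None if there is a tie."""
    ranked = sorted((hamming_distance(v, received), v) for v in code)
    if len(ranked) > 1 and ranked[0][0] == ranked[1][0]:
        return None  # two code words are equally close: do not decode
    return ranked[0][1]


def corrected_patterns(code):
    """All error patterns u such that w + u decodes to w for every w in the code."""
    length = len(code[0])
    patterns = ("".join(bits) for bits in product("01", repeat=length))
    return {u for u in patterns if all(ncwd(code, add(w, u)) == w for w in code)}


C1 = ["00000", "11100", "01111"]                  # Example 4.7.3
C2 = ["000000", "000111", "111000", "111111"]     # Example 4.7.4
print(sorted(corrected_patterns(C1)))
print(len(corrected_patterns(C2)))  # 16
```

Exhaustive search over all 2^ℓ patterns is only practical for short codes such as these.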

4.7.5 Theorem Suppose C ⊆ {0,1}^ℓ and |C| ≥ 2. Then C corrects all error patterns of length ℓ and of Hamming weight ≤ (d(C) − 1)/2.

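Proof: Suppose u ∈ {0,1}^ℓ and wt(u) ≤ (d(C) − 1)/2. Suppose that w, v ∈ C and w ≠ v. Then, by the triangle inequality for dH and by 4.7.1,

\[ \begin{aligned}
d(C) \le d_H(w, v) &\le d_H(w, w + u) + d_H(w + u, v) \\
&= \mathrm{wt}(w + (w + u)) + d_H(w + u, v) \\
&= \mathrm{wt}(u) + d_H(w + u, v),
\end{aligned} \]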


which implies

\[ d_H(w + u, v) \ge d(C) - \mathrm{wt}(u) \ge d(C) - \frac{d(C) - 1}{2} = \frac{d(C) + 1}{2} > \frac{d(C) - 1}{2} \ge \mathrm{wt}(u) = d_H(w, w + u). \]

Thus, for each w ∈ C, w is the unique word in C closest to w + u, so NCWD will decode w + u as w.

4.7.6 Corollary Let t = ⌊(d(C) − 1)/2⌋. Then, with E denoting the maximum error probability with C in use, with a binary symmetric channel with reliability p > 1/2,

\[ E \le 1 - \sum_{j=0}^{t} \binom{\ell}{j} p^{\ell - j} (1 - p)^j. \]

Proof: Let U = {u ∈ {0,1}^ℓ ; wt(u) ≤ t} and, for each v ∈ C, let N_v = {w ∈ {0,1}^ℓ ; NCWD decodes w as v}. By the theorem,

v + U = {v + u ; u ∈ U} ⊆ N_v, for each v ∈ C.

Therefore, for each v ∈ C, with w denoting “the received word,”

\[ \begin{aligned}
P(w \in N_v \mid v \text{ is sent}) &\ge P(w \in v + U \mid v \text{ is sent}) \\
&= P(\text{the error pattern } u \text{ lies in } U \mid v \text{ is sent}) \\
&= P(u \in U) \\
&= P(t \text{ or fewer errors occurred in } \ell \text{ trials, with probability } 1 - p \text{ of error on each trial}) \\
&= \sum_{j=0}^{t} \binom{\ell}{j} p^{\ell - j} (1 - p)^j,
\end{aligned} \]

by Theorem 1.5.7. Since v ∈ C is arbitrary, the desired conclusion follows.
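The bound in Corollary 4.7.6 depends only on the word length, the distance d(C), and the reliability p, so it is easy to evaluate numerically. The following Python sketch assumes the binomial form of the sum given above; the function names and the sample values (length 7, distance 3, p = 0.9) are illustrative assumptions and do not come from the text.

```python
from math import comb


def prob_at_most_t_errors(length: int, t: int, p: float) -> float:
    """P(t or fewer errors in `length` trials), each bit arriving intact with probability p."""
    return sum(comb(length, j) * p ** (length - j) * (1 - p) ** j
               for j in range(t + 1))


def max_error_bound(length: int, d: int, p: float) -> float:
    """Upper bound on the maximum error probability E, with t = floor((d - 1) / 2)."""
    t = (d - 1) // 2
    return 1 - prob_at_most_t_errors(length, t, p)


# Hypothetical parameters: a code of length 7 and distance 3, channel reliability 0.9.
print(prob_at_most_t_errors(7, 1, 0.9))
print(max_error_bound(7, 3, 0.9))
```

For these values the sum is p^7 + 7p^6(1 − p), which can be checked by hand.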

Definition Suppose C ⊆ {0,1}^ℓ and |C| ≥ 2. Let d = d(C). In simplified nearest code word decoding (SNCWD), a received word w ∈ {0,1}^ℓ is decoded as v ∈ C if and only if dH(v,w) ≤ (d − 1)/2. If there is no such v ∈ C, do not decode w.

By the proof of Theorem 4.7.5, for each w ∈ {0,1}^ℓ there is at most one v ∈ C such that dH(v,w) = wt(v + w) ≤ (d(C) − 1)/2. Observe that if u = v + w then v = w + u, because of the peculiar definition of +. Consequently, the carrying out of SNCWD can proceed as follows: given w, start calculating the words v + w, v ∈ C, until you run across one of weight ≤ (d(C) − 1)/2. If you have saved the v, report that as the intended code word. Alternatively, v can be recovered from v + w and w by addition. If there is no v ∈ C for which wt(v + w) ≤ (d(C) − 1)/2, report “do not decode.”
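The procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: words are strings of 0s and 1s, "do not decode" is represented by None, and the names sncwd, add, wt and code_distance are hypothetical helpers. Stopping at the first code word found is justified by the uniqueness remark above.

```python
from itertools import combinations


def add(u: str, v: str) -> str:
    """Coordinatewise sum of two binary words, using 1 + 1 = 0."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


def wt(w: str) -> int:
    """Hamming weight: the number of ones in w."""
    return w.count("1")


def code_distance(code) -> int:
    """d(C): minimum weight of w + v over pairs of distinct code words."""
    return min(wt(add(w, v)) for w, v in combinations(code, 2))


def sncwd(code, received):
    """Decode `received` as the code word v with wt(v + received) <= (d - 1)/2, if any."""
    # Weights are integers, so comparing with floor((d - 1) / 2) is equivalent.
    radius = (code_distance(code) - 1) // 2
    for v in code:
        if wt(add(v, received)) <= radius:
            return v      # at most one such v exists, so we can stop here
    return None           # "do not decode"


C = ["00000", "11100", "01111"]
print(sncwd(C, "11101"))  # 11100
print(sncwd(C, "10110"))  # None
```

The second call shows the loss of reliability: the received word 10110 lies at distance 2 from 11100 and at distance 3 from the other two code words, so NCWD would decode it, while SNCWD does not.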

Is this procedure any easier than plain old NCWD? From the naive point of view, no. In both procedures you have to calculate dH(v,w), v ∈ C, until either a v is found for which dH(v,w) ≤ (d(C) − 1)/2, or until all v ∈ C have been tried, at which point, with NCWD, the numbers dH(v,w) must be compared.

You save a little trouble with SNCWD by omitting this last comparison; but surely, you might think, this saving would not compensate us sufficiently for the loss of reliability incurred by forsaking NCWD for SNCWD.

But the fact is that SNCWD is very commonly used. The details are beyond the scope of this course. Suffice it to say that knowing exactly which error patterns will be corrected sometimes leads, in the presence of certain algebraic properties of C, to very efficient decoding procedures.

If we define “C corrects the error pattern u” for SNCWD as it was defined for NCWD, it is easy to see that, with SNCWD, the error patterns corrected by C are precisely the words of weight ≤ (d(C) − 1)/2; furthermore, the error patterns that are decoded correctly are the same whichever code word is sent. See Example 4.7.3 to see that this is not necessarily the case with NCWD. Note also Example 4.7.4 and compare the error patterns corrected there by NCWD with the error patterns corrected by SNCWD.

Reliability For C ⊆ {0,1}^ℓ, let R(C, p) denote the reliability of the code-and-channel system obtained by using C, a binary symmetric channel with reliability p, and NCWD. (As elsewhere in this section, the source frequencies are assumed to be equal.) Let RS(C, p) denote the reliability when SNCWD is used. Corollary 4.7.6 implies that

\[ R(C, p) \ge \sum_{j=0}^{\lfloor (d(C) - 1)/2 \rfloor} \binom{\ell}{j} p^{\ell - j} (1 - p)^j. \]

4.7.7 Proposition

\[ R_S(C, p) = \sum_{j=0}^{\lfloor (d(C) - 1)/2 \rfloor} \binom{\ell}{j} p^{\ell - j} (1 - p)^j. \]

The proof, after that of 4.7.5 and the remarks above, is straightforward.

R0(C, p) and (RS)0(C, p) will denote, as in Exercise 4.5.7, the relaxed reliabilities obtained by not considering a “do not decode” message to be an error.

Exercises 4.7

1. Express explicitly, as formulas in p, the reliabilities R(C, p), RS(C, p), R0(C, p), and (RS)0(C, p), when C is the code of Example 4.7.3.

2. Same question for Example 4.7.4.

3. There are two famous binary block codes, of lengths 23 and 24, called the Golay code and the extended Golay code, respectively. Let us denote the Golay code by C23, and the extended Golay code by C24. Their distances are d(C23) = 7 and d(C24) = 8.

In document Introduction to Information Theory and Data Compression (Page 119-123)