Mutual Information and Unconditionally Secure Systems


5.2 Mutual Information and Unconditionally Secure Systems

Quite often random variables contain information about each other. In cryptosystems, the plaintext and the ciphertext are related through the key. In this section we shall give a formal definition (in the information-theoretic sense of the word) of an unconditionally secure cryptosystem.

Let X and Y be two random variables, defined on sets $\mathcal{X}$ and $\mathcal{Y}$, respectively. The joint distribution $\Pr(X = x, Y = y)$ of X and Y is often shortened to just $p(x, y)$.

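Similarly, the conditional probability $\Pr(X = x \mid Y = y)$ is denoted by $p(x \mid y)$.

It satisfies the relation

$$p(x, y) = p(x \mid y)\, p(y). \tag{5.5}$$

The uncertainty about X given $Y = y$ is defined analogously to the entropy function:

$$H(X \mid Y = y) = -\sum_{x} p(x \mid y) \log_2 p(x \mid y). \tag{5.6}$$

It can be interpreted as the expected amount of information that a realization of X gives when the occurrence of $Y = y$ is already known.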

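The equivocation or conditional entropy $H(X \mid Y)$ of X given Y is the expected value of $H(X \mid Y = y)$ over all y. In formula,

$$H(X \mid Y) = \sum_{y} p(y)\, H(X \mid Y = y) = -\sum_{x, y} p(x, y) \log_2 p(x \mid y). \tag{5.7}$$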
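The definition of the equivocation can be sketched in a few lines of Python. The joint distribution below is hypothetical and chosen only for illustration, and the function names are not from the text; the sketch computes $H(X \mid Y = y)$ for each y and then averages over y.

```python
from math import log2

# Hypothetical joint distribution p(x, y) on {0, 1} x {0, 1} (illustrative values only).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}


def marginal_y(joint):
    """Return p(y), obtained by summing p(x, y) over all x."""
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return p_y


def entropy_given_y(joint, y):
    """H(X | Y = y) = -sum_x p(x|y) log2 p(x|y), where p(x|y) = p(x, y) / p(y)."""
    p_y = marginal_y(joint)[y]
    total = 0.0
    for (_, yy), p in joint.items():
        if yy == y and p > 0:
            cond = p / p_y
            total -= cond * log2(cond)
    return total


def conditional_entropy(joint):
    """H(X | Y): the expected value of H(X | Y = y) over all y."""
    return sum(p * entropy_given_y(joint, y) for y, p in marginal_y(joint).items())


print(conditional_entropy(joint))
```

The same value is obtained from the double sum $-\sum_{x,y} p(x,y) \log_2 p(x \mid y)$, which is a convenient cross-check.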

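Let H(X, Y) be defined analogously to the entropy function H for one variable, i.e.

$$H(X, Y) = -\sum_{x, y} p(x, y) \log_2 p(x, y).$$

Theorem 5.1 (chain rule):

$$H(X, Y) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y).$$

Proof: We use (5.5) and (5.7), with the roles of X and Y interchanged.

$$H(X, Y) = -\sum_{x, y} p(x, y) \log_2 \bigl(p(x)\, p(y \mid x)\bigr) = -\sum_{x} p(x) \log_2 p(x) - \sum_{x, y} p(x, y) \log_2 p(y \mid x) = H(X) + H(Y \mid X).$$

The second equality follows by a symmetry argument.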

In words, the above theorem states that the uncertainty about a joint realization of X and Y equals the uncertainty about X plus the uncertainty about Y given X.
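The chain rule is easy to check numerically. The following is a minimal sketch, assuming a small hypothetical joint distribution; the helper names `entropy`, `marginal` and `conditional_entropy` are introduced here for illustration only.

```python
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(joint, axis):
    """Marginal of a joint distribution {(x, y): p} on coordinate 0 (X) or 1 (Y)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out


def conditional_entropy(joint, given_axis):
    """Entropy of the other variable given the variable on coordinate `given_axis`."""
    given = marginal(joint, given_axis)
    return -sum(p * log2(p / given[pair[given_axis]])
                for pair, p in joint.items() if p > 0)


# Hypothetical joint distribution, chosen only for illustration.
joint = {("a", 0): 0.5, ("a", 1): 0.125, ("b", 0): 0.125, ("b", 1): 0.25}

h_xy = entropy(joint)                                                 # H(X, Y)
via_x = entropy(marginal(joint, 0)) + conditional_entropy(joint, 0)   # H(X) + H(Y|X)
via_y = entropy(marginal(joint, 1)) + conditional_entropy(joint, 1)   # H(Y) + H(X|Y)
print(h_xy, via_x, via_y)  # the three values agree up to rounding
```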

Statements ii) and iii) follow directly from i) and the chain rule.

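The amount of information (see (5.1)) that a realization $Y = y$ gives about a possible realization $X = x$ can be quantified as the amount of information that the occurrence of $X = x$ gives minus the amount of information that $X = x$ will give when $Y = y$ is already known. We denote this by $I(x; y)$, so

$$I(x; y) = -\log_2 p(x) - \bigl(-\log_2 p(x \mid y)\bigr) = \log_2 \frac{p(x \mid y)}{p(x)}.$$

It follows that

$$I(x; y) = \log_2 \frac{p(x, y)}{p(x)\, p(y)}.$$

Note the symmetry in x and y.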

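The mutual information I(X; Y) of X and Y is defined as the expected value of $I(x; y)$, i.e.

$$I(X; Y) = \sum_{x, y} p(x, y)\, I(x; y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)}.$$

Theorem 5.3:

$$I(X; Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(X, Y).$$

Proof: From (5.8) it follows that

$$I(X; Y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x \mid y)}{p(x)} = -\sum_{x} p(x) \log_2 p(x) + \sum_{x, y} p(x, y) \log_2 p(x \mid y) = H(X) - H(X \mid Y).$$

The other statements follow from Theorem 5.1.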

I(X; Y) can be interpreted as the expected amount of information that Y gives about X (or X about Y).
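To make this interpretation concrete, the sketch below computes the mutual information as the expected value of $\log_2 \bigl(p(x, y) / (p(x)\, p(y))\bigr)$ for two hypothetical extreme cases: Y is a copy of X, and Y is independent of X. The distributions and function names are assumptions made for illustration.

```python
from math import log2


def marginals(joint):
    """Return the pair (p_x, p_y) of marginals of a joint distribution {(x, y): p}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y


def mutual_information(joint):
    """I(X; Y) in bits, as the expected value of log2(p(x,y) / (p(x) p(y)))."""
    p_x, p_y = marginals(joint)
    return sum(p * log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in joint.items() if p > 0)


# Hypothetical example 1: Y always equals X, with X a fair bit.
copy = {(0, 0): 0.5, (1, 1): 0.5}
# Hypothetical example 2: X and Y are independent fair bits.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

print(mutual_information(copy))         # Y reveals X completely
print(mutual_information(independent))  # Y reveals nothing about X
```

In the first case Y gives one full bit of information about X; in the second case it gives none.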

We conclude that the receiver gets 1 - H(p) bits of information about X per received symbol Y. How to approach this quantity 1 - H(p) is the fundamental problem in algebraic coding theory; see [MacWS77], Section 1.6.

For $p = 1/2$ the receiver gets no information about the transmitted symbols, as is to be expected.
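The quantity 1 - H(p) can be checked numerically. The sketch below assumes that the example concerns a binary symmetric channel with error probability p and a uniformly distributed input bit, which is what the expression 1 - H(p) suggests; the function names are hypothetical.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a binary source with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bsc_mutual_information(p):
    """I(X; Y) for a uniform input bit X sent over a binary symmetric channel.

    Assumption: Y equals X with probability 1 - p and is flipped with
    probability p, so that p(x) = p(y) = 1/2 for every x and y.
    """
    joint = {(x, y): 0.5 * ((1 - p) if x == y else p)
             for x in (0, 1) for y in (0, 1)}
    return sum(q * log2(q / 0.25) for q in joint.values() if q > 0)


for p in (0.0, 0.1, 0.3, 0.5):
    print(p, bsc_mutual_information(p), 1 - binary_entropy(p))
```

The two computed columns agree, and both vanish at p = 1/2.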

Let us now return to the conventional cryptosystem as explained in Chapter 1. Assume that a probability distribution is defined on the key space, with $K$ denoting the randomly chosen key, and let the sequence of random variables $M = (M_1, M_2, \ldots, M_u)$ denote the plaintext, and let $C = (C_1, C_2, \ldots, C_v)$ denote the ciphertext. So, $C = E_K(M)$. In most applications $v$ will be equal to $u$. Since $E_K$ is a one-to-one mapping, the plaintext is uniquely determined by the key and the ciphertext; therefore, one has

$$H(M \mid K, C) = 0. \tag{5.9}$$

Of course the user of the cryptosystem is interested to know how much information about the plaintext $M$ leaks through the ciphertext $C$.

Theorem 5.4:

$$I(M; C) \ge H(M) - H(K).$$

In words: the uncertainty about the key together with the information that the ciphertext gives about the plaintext is greater than or equal to the uncertainty about the plaintext. Again, this reflects our intuition.

Proof of Theorem 5.4:

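By (5.9) and the chain rule (Thm. 5.1, which also applies to conditional entropies) one has that

$$H(K \mid C) = H(K \mid C) + H(M \mid K, C) = H(M, K \mid C) = H(M \mid C) + H(K \mid M, C) \ge H(M \mid C).$$

In words: given the ciphertext, the uncertainty about the key is at least as great as the uncertainty about the plaintext. This reflects the property that, knowing the ciphertext, one can reconstruct the plaintext from the key, but not necessarily the other way around.

It follows that

$$H(K) \ge H(K \mid C) \ge H(M \mid C)$$

and by Theorem 5.3 that

$$I(M; C) = H(M) - H(M \mid C) \ge H(M) - H(K).$$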

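A cryptosystem for which $I(M; C) = 0$ is called unconditionally secure, or is said to have perfect secrecy. By Theorem 5.4 a necessary condition for this is $H(K) \ge H(M)$ (Corollary 5.5). In a cryptosystem where all keys and all plaintexts are equally likely, Corollary 5.5 states that you need to have at least as many keys as plaintexts.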

Example 5.4

Suppose that we have $2^n$ keys, all with probability $2^{-n}$. Then $H(K) = n$.

If the messages are the outcome of u tossings with a fair coin, one has in a similar way that $H(M) = u$, so for perfect secrecy one needs $n \ge u$.

This can be realized by the encryption $E_k(m) = m \oplus k^{(u)}$, where $k^{(u)}$ stands for the first u bits of the key k and where $\oplus$ stands for coordinatewise modulo 2 addition. With this encryption, with each
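The coordinatewise modulo 2 encryption of Example 5.4 can be examined by exhaustive enumeration. The sketch below is a small illustration, not taken from the text: it assumes u = 3, keys of exactly u bits, and uniformly distributed messages and keys, and it computes how much information the ciphertext gives about the plaintext. The restricted key set `short_keys` is a hypothetical contrast case with fewer keys than plaintexts.

```python
from itertools import product
from math import log2


def xor_bits(m, k):
    """Coordinatewise modulo 2 addition of two equally long bit tuples."""
    return tuple(a ^ b for a, b in zip(m, k))


def leak(messages, keys):
    """I(M; C) in bits for uniform, independent messages and keys, C = M xor K."""
    p = 1.0 / (len(messages) * len(keys))
    joint = {}
    for m in messages:
        for k in keys:
            c = xor_bits(m, k)
            joint[(m, c)] = joint.get((m, c), 0.0) + p
    p_m, p_c = {}, {}
    for (m, c), q in joint.items():
        p_m[m] = p_m.get(m, 0.0) + q
        p_c[c] = p_c.get(c, 0.0) + q
    return sum(q * log2(q / (p_m[m] * p_c[c])) for (m, c), q in joint.items())


u = 3  # assumed message length in bits
messages = list(product((0, 1), repeat=u))
full_keys = list(product((0, 1), repeat=u))          # 2^u keys: H(K) = u
short_keys = [k for k in full_keys if k[-1] == 0]    # 2^(u-1) keys: H(K) = u - 1

print(leak(messages, full_keys))   # as many keys as plaintexts
print(leak(messages, short_keys))  # fewer keys than plaintexts
```

With the full key set the leak is zero, while with the restricted key set one bit leaks, in agreement with the bound that the information the ciphertext gives about the plaintext is at least the uncertainty about the plaintext minus the uncertainty about the key.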

Problem 5.1

Show that function satisfies properties P1-P4 in Section 5.1.

Problem 5.2

a) Prove that

b) Show that these inequalities imply that

where h(x) is the entropy function defined in (5.4).

Problem 5.3

Assume that the English language has an information rate of 1.5 bits per letter. What is the unicity distance of the Caesar cipher, when applied to an English text?

Answer the same question for the Vigenère cryptosystem with key length r.

Problem 5.4

Consider a memoryless message source that generates an output letter X that is uniformly distributed over the alphabet {0, 1, 2}.

After transmission over a channel, the received symbol Y will be equal to X with probability $1 - p$, and it will be equal to each of the other two letters in the alphabet with probability $p/2$.

Compute the mutual information I(X; Y) between X and Y.

Problem 5.5

Let be a plaintext source that generates independent, identically distributed letters X from {a, b, c, d}. The probability distribution is given by

Consider the two coding schemes:

The output sequence of the plaintext X is first converted into a {0, 1}-sequence by means of one of the above coding schemes and subsequently encrypted with the DES algorithm.

What is the unicity distance for both coding schemes?

Problem 5.6


In document Fundamentals of Cryptology A Professional Reference & Interactive Tutorial pdf (Page 95-102)