cryptosystem that could be used as a standard in unclassified U.S Government applications IBM
5.2 Mutual Information and Unconditionally Secure Systems
Quite often random variables contain information about each other. In cryptosystems, the plaintext and the ciphertext are related through the key. In this section we shall give a formal definition (in the information-theoretic sense of the word) of an unconditionally secure cryptosystem.
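Let X and Y be two random variables, defined on $\mathcal{X}$ resp. $\mathcal{Y}$. The joint distribution $\Pr(X = x, Y = y)$ of X and Y is often shortened to just $p(x, y)$.
Similarly, the conditional probability $\Pr(X = x \mid Y = y)$ is denoted by $p(x \mid y)$.
It satisfies the relation
$$p(x, y) = p(x \mid y)\, p(y).$$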
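The uncertainty about X given $Y = y$ is defined analogously to the entropy function by
$$H(X \mid Y = y) = -\sum_{x} p(x \mid y) \log_2 p(x \mid y).$$
It can be interpreted as the expected amount of information that a realization of X gives when the occurrence of $Y = y$ is already known.
The equivocation or conditional entropy of X given Y is the expected value of $H(X \mid Y = y)$ over all y. In formula,
$$H(X \mid Y) = \sum_{y} p(y)\, H(X \mid Y = y) = -\sum_{x, y} p(x, y) \log_2 p(x \mid y).$$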
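Let H(X, Y) be defined analogously to the entropy function H for one variable, i.e. $H(X, Y) = -\sum_{x, y} p(x, y) \log_2 p(x, y)$.
Theorem 5.1 (Chain rule)
$$H(X, Y) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y).$$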
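Proof: We use (5.5) and (5.7).
$$H(X, Y) = -\sum_{x, y} p(x, y) \log_2 p(x, y) = -\sum_{x, y} p(x, y) \log_2 p(x) - \sum_{x, y} p(x, y) \log_2 p(y \mid x) = H(X) + H(Y \mid X).$$
The second equality of the theorem follows by a symmetry argument.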
In words, the above theorem states that the uncertainty about a joint realization of X and Y equals the uncertainty about X plus the uncertainty about Y given X.
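The chain rule can be checked numerically on a small example. The sketch below is only an illustration: the joint distribution `joint` and the helper names `marginal`, `entropy` and `conditional_entropy` are made up for this purpose and do not come from the text.

```python
from math import log2

# Hypothetical joint distribution p(x, y) on {0, 1} x {0, 1, 2}; the numbers are illustrative only.
joint = {
    (0, 0): 0.25, (0, 1): 0.125, (0, 2): 0.125,
    (1, 0): 0.0,  (1, 1): 0.25,  (1, 2): 0.25,
}

def marginal(joint, axis):
    """Marginal distribution of the coordinate at position `axis` (0 for X, 1 for Y)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

def entropy(dist):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def conditional_entropy(joint, given_axis):
    """H(other | given) = -sum over (x, y) of p(x, y) * log2 p(other | given)."""
    given = marginal(joint, given_axis)
    return -sum(p * log2(p / given[pair[given_axis]])
                for pair, p in joint.items() if p > 0)

h_x = entropy(marginal(joint, 0))
h_y = entropy(marginal(joint, 1))
h_xy = entropy(joint)
h_y_given_x = conditional_entropy(joint, 0)
h_x_given_y = conditional_entropy(joint, 1)

# Both decompositions of the joint uncertainty should agree with H(X, Y).
print(h_xy, h_x + h_y_given_x, h_y + h_x_given_y)
```

For this particular distribution one has H(X) = 1 and H(Y | X) = 1.25, so both sums equal H(X, Y) = 2.25 bits.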
Statements ii) and iii) follow directly from i) and the chain rule.
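The amount of information (see (5.1)) that a realization $Y = y$ gives about a possible realization $X = x$ can be quantified as the amount of information that the occurrence of $X = x$ gives minus the amount of information that $X = x$ will give when $Y = y$ is already known. We denote this by $I(x; y)$.
It follows that
$$I(x; y) = -\log_2 p(x) + \log_2 p(x \mid y) = \log_2 \frac{p(x \mid y)}{p(x)} = \log_2 \frac{p(x, y)}{p(x)\, p(y)}.$$
Note the symmetry in x and y.
The mutual information I(X; Y) of X and Y is defined as the expected value of $I(x; y)$, i.e.
$$I(X; Y) = \sum_{x, y} p(x, y)\, I(x; y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)}.$$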
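Proof: From (5.8) it follows that
$$I(X; Y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x \mid y)}{p(x)} = -\sum_{x, y} p(x, y) \log_2 p(x) + \sum_{x, y} p(x, y) \log_2 p(x \mid y) = H(X) - H(X \mid Y).$$
The other statements follow from Theorem 5.1.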
I(X; Y) can be interpreted as the expected amount of information that Y gives about X (or X about Y).
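As an illustration of this interpretation, the following sketch computes the mutual information directly from a joint distribution and compares it with H(X) + H(Y) - H(X, Y). The two distributions `dependent` and `independent` and all helper names are hypothetical and chosen only for the example.

```python
from math import log2

def marginals(joint):
    """Return the marginal distributions (p_x, p_y) of a joint distribution {(x, y): p}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y

def entropy(dist):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X;Y) = sum over (x, y) of p(x, y) * log2( p(x, y) / (p(x) p(y)) )."""
    p_x, p_y = marginals(joint)
    return sum(p * log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in joint.items() if p > 0)

# Illustrative distributions: in the first, Y depends on X; in the second, X and Y are independent.
dependent = {(0, 0): 0.25, (0, 1): 0.125, (0, 2): 0.125, (1, 1): 0.25, (1, 2): 0.25}
independent = {(x, y): px * py
               for x, px in ((0, 0.5), (1, 0.5))
               for y, py in ((0, 0.25), (1, 0.75))}

for name, joint in (("dependent", dependent), ("independent", independent)):
    p_x, p_y = marginals(joint)
    via_entropies = entropy(p_x) + entropy(p_y) - entropy(joint)
    print(name, mutual_information(joint), via_entropies)
```

In the independent case the mutual information vanishes: observing Y tells nothing about X.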
We conclude that the receiver gets 1 - H(p) bits of information about X per received symbol Y. How to approach this quantity 1 - H(p) is the fundamental problem in algebraic coding theory [MacWS77], Section 1.6.
Note that for $p = 1/2$ one has $1 - H(p) = 0$: the receiver gets no information about the transmitted symbols, as is to be expected.
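The quantity 1 - H(p) can be reproduced with a few lines of code. The sketch below assumes that the channel in question is a binary symmetric channel, i.e. a uniformly distributed input bit X that is flipped with probability p; this setup and the function names `h` and `bsc_mutual_information` are assumptions made for the illustration.

```python
from math import log2

def h(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_mutual_information(p):
    """I(X;Y) for a uniform input bit X sent over a binary symmetric channel
    that flips the bit with probability p (assumed setup)."""
    joint = {(x, y): 0.5 * (p if x != y else 1 - p)
             for x in (0, 1) for y in (0, 1)}
    # Both marginals are uniform, so p(x) p(y) = 1/4 for every pair (x, y).
    return sum(q * log2(q / 0.25) for q in joint.values() if q > 0)

# The value computed from the definition should match 1 - h(p).
for p in (0.0, 0.1, 0.25, 0.5):
    print(p, bsc_mutual_information(p), 1 - h(p))
```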
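Let us now return to the conventional cryptosystem as explained in Chapter 1. Assume that a probability distribution is defined on the keyspace $\mathcal{K}$, with the random variable $K$ denoting the key. Let the sequence of random variables $M^{(u)} = (M_0, M_1, \ldots, M_{u-1})$ denote the plaintext, and let $C^{(v)} = (C_0, C_1, \ldots, C_{v-1})$ denote the ciphertext. So, $C^{(v)} = E_K(M^{(u)})$. In most applications v will be equal to u. Since $E_k$ is a one-to-one mapping, the plaintext is uniquely determined by the key and the ciphertext; therefore, one has
$$H(M^{(u)} \mid K, C^{(v)}) = 0.$$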
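Of course the user of the cryptosystem is interested to know how much information leaks about the plaintext $M^{(u)}$ when the ciphertext $C^{(v)}$ is known.
Theorem 5.4
$$I(M^{(u)}; C^{(v)}) \geq H(M^{(u)}) - H(K).$$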
In words: the uncertainty about the key together with the information that the ciphertext gives about the plaintext is greater than or equal to the uncertainty about the plaintext. Again, this reflects our intuition.
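Proof of Theorem 5.4:
By (5.9) and the chain rule (Thm. 5.1, which also applies to conditional entropies) one has for the key $K$, the plaintext $M^{(u)}$ and the ciphertext $C^{(v)}$ that
$$H(K \mid C^{(v)}) = H(K \mid C^{(v)}) + H(M^{(u)} \mid K, C^{(v)}) = H(M^{(u)}, K \mid C^{(v)}) = H(M^{(u)} \mid C^{(v)}) + H(K \mid M^{(u)}, C^{(v)}) \geq H(M^{(u)} \mid C^{(v)}).$$
In words: given the ciphertext, the uncertainty about the key is at least as great as the uncertainty about the plaintext. This reflects the property that knowing the ciphertext, one can reconstruct the plaintext from the key, but not necessarily the other way around.
It follows that
$$H(M^{(u)} \mid C^{(v)}) \leq H(K \mid C^{(v)}) \leq H(K),$$
and by Theorem 5.3 that
$$I(M^{(u)}; C^{(v)}) = H(M^{(u)}) - H(M^{(u)} \mid C^{(v)}) \geq H(M^{(u)}) - H(K).$$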
In a cryptosystem where all keys and all plaintexts are equally likely, Corollary 5.5 states that you need to have at least as many keys as plaintexts.
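The effect of having too few keys can be made concrete with a toy cipher. The sketch below is an illustration under stated assumptions, not a construction from the text: messages are uniformly distributed bit strings of length `u`, the key is a uniformly distributed bit string of length `key_bits`, and the hypothetical function `leak` encrypts by XORing the message with the key repeated cyclically. It returns H(K), H(M) and the leakage I(M; C) computed by exhaustive enumeration.

```python
from itertools import product
from math import log2

def leak(u, key_bits):
    """Return (H(K), H(M), I(M;C)) in bits for a toy XOR cipher.

    Assumed setup: uniform u-bit messages, uniform key of key_bits bits,
    ciphertext = message XOR (key repeated cyclically to length u).
    """
    messages = list(product((0, 1), repeat=u))
    keys = list(product((0, 1), repeat=key_bits))
    p_m, p_k = 1 / len(messages), 1 / len(keys)

    # Joint distribution of (plaintext, ciphertext), summed over all keys.
    joint = {}
    for m in messages:
        for k in keys:
            pad = [k[i % key_bits] for i in range(u)]
            c = tuple(a ^ b for a, b in zip(m, pad))
            joint[(m, c)] = joint.get((m, c), 0.0) + p_m * p_k

    # Marginal distribution of the ciphertext.
    p_c = {}
    for (m, c), p in joint.items():
        p_c[c] = p_c.get(c, 0.0) + p

    i_mc = sum(p * log2(p / (p_m * p_c[c]))
               for (m, c), p in joint.items() if p > 0)
    return log2(len(keys)), log2(len(messages)), i_mc

# Two keys for four plaintexts: information leaks.
print(leak(2, 1))
# Four keys for four plaintexts: nothing leaks.
print(leak(2, 2))
```

With a 1-bit key on 2-bit messages the ciphertext reveals one full bit about the plaintext, so H(K) + I(M; C) = H(M) and the bound of Theorem 5.4 holds with equality. With a 2-bit key the leakage drops to zero.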
Example 5.4
Suppose that we have $2^n$ keys, all with probability $2^{-n}$. Then $H(K) = n$.
If the messages are the outcome of u tossings with a fair coin, one has in a similar way that $H(M^{(u)}) = u$, so, for perfect secrecy one needs $n \geq u$.
This can be realized by the encryption $E_k(m) = m \oplus k^{(u)}$, where $k^{(u)}$ stands for the first u bits of the key k and where $\oplus$ stands for a coordinatewise modulo 2 addition. With this encryption, with each
Problem 5.1
Show that function satisfies properties P1-P4 in Section 5.1.
Problem 5.2
a) Prove that
b) Show that these inequalities imply that
where h(x) is the entropy function defined in (5.4).
Problem 5.3
Assume that the English language has an information rate of 1.5 bits per letter. What is the unicity distance of the Caesar cipher, when applied to an English text?
Answer the same question for the Vigenère cryptosystem with key length r.
Problem 5.4
Consider a memoryless message source that generates an output letter X that is uniformly distributed over the alphabet {0, 1, 2}.
After transmission over a channel the symbol Y that is received will be equal to X with probability $1 - p$, and it will be equal to each of the other two letters in the alphabet with probability $p/2$.
Compute the mutual information I(X; Y) between X and Y.
Problem 5.5
Consider a plaintext source that generates independent, identically distributed letters X from {a, b, c, d}. The probability distribution is given by
Consider the two coding schemes:
The output sequence of the plaintext X is first converted into a {0, 1}-sequence by means of one of the above coding schemes and subsequently encrypted with the DES algorithm.
What is the unicity distance for both coding schemes?
Problem 5.6