

Private-Key Encryption

3.5 Constructing CPA-Secure Encryption Schemes

Before constructing encryption schemes secure against chosen-plaintext attacks, we first introduce the important notion of pseudorandom functions.

3.5.1 Pseudorandom Functions and Block Ciphers

Pseudorandom functions (PRFs) generalize the notion of pseudorandom generators. Now, instead of considering “random-looking” strings we consider “random-looking” functions. As in our earlier discussion of pseudorandomness, it does not make much sense to say that any fixed function $f : \{0,1\}^* \to \{0,1\}^*$ is pseudorandom (in the same way that it makes little sense to say that any fixed function is random). Thus, we must instead refer to the pseudorandomness of a distribution on functions. Such a distribution is induced naturally by considering keyed functions, defined next.

A keyed function $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$ is a two-input function, where the first input is called the key and denoted $k$. We say $F$ is efficient if there is a polynomial-time algorithm that computes $F(k, x)$ given $k$ and $x$. (We will only be interested in efficient keyed functions.) In typical usage a key $k$ is chosen and fixed, and we are then interested in the single-input function $F_k : \{0,1\}^* \to \{0,1\}^*$ defined by $F_k(x) = F(k, x)$. The security parameter $n$ dictates the key length, input length, and output length. That is, we associate with $F$ three functions $\ell_{key}$, $\ell_{in}$, and $\ell_{out}$; for any key $k \in \{0,1\}^{\ell_{key}(n)}$, the function $F_k$ is only defined for inputs $x \in \{0,1\}^{\ell_{in}(n)}$, in which case $F_k(x) \in \{0,1\}^{\ell_{out}(n)}$. Unless stated otherwise, we assume for simplicity that $F$ is length-preserving, meaning $\ell_{key}(n) = \ell_{in}(n) = \ell_{out}(n) = n$. In other words, by fixing a key $k \in \{0,1\}^n$ we obtain a function $F_k$ mapping $n$-bit input strings to $n$-bit output strings.

A keyed function $F$ induces a natural distribution on functions given by choosing a uniform key $k \in \{0,1\}^n$ and then considering the resulting single-input function $F_k$. We call $F$ pseudorandom if the function $F_k$ (for a uniform key $k$) is indistinguishable from a function chosen uniformly at random from the set of all functions having the same domain and range; that is, if no efficient adversary can distinguish (in a sense we define more carefully below) whether it is interacting with $F_k$ (for uniform $k$) or $f$ (where $f$ is chosen uniformly from the set of all functions mapping $n$-bit inputs to $n$-bit outputs).

Since choosing a function at random is less intuitive than choosing a string at random, it is worth spending a bit more time on this idea. Consider the set $\mathsf{Func}_n$ of all functions mapping $n$-bit strings to $n$-bit strings. This set is finite, and selecting a uniform function mapping $n$-bit strings to $n$-bit strings means choosing an element uniformly from this set. How large is $\mathsf{Func}_n$? A function $f$ is specified by giving its value on each point in its domain. We can view any function (over a finite domain) as a large look-up table that stores $f(x)$ in the row of the table labeled by $x$. For $f \in \mathsf{Func}_n$, the look-up table for $f$ has $2^n$ rows (one for each point of the domain $\{0,1\}^n$), with each row containing an $n$-bit string (since the range of $f$ is $\{0,1\}^n$). Concatenating all the entries of the table, we see that any function in $\mathsf{Func}_n$ can be represented by a string of length $2^n \cdot n$. Moreover, this correspondence is one-to-one, as each string of length $2^n \cdot n$ (i.e., each table containing $2^n$ entries of length $n$) defines a unique function in $\mathsf{Func}_n$. Thus, the size of $\mathsf{Func}_n$ is exactly the number of strings of length $n \cdot 2^n$, or $|\mathsf{Func}_n| = 2^{n \cdot 2^n}$.
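The counting argument can be checked by brute force for very small n. The following is a minimal sketch; the helper names `all_functions` and `func_count` are illustrative and not part of the text. It represents each function as a look-up table with 2^n rows and compares the number of tables with the formula 2^(n·2^n).

```python
from itertools import product


def all_functions(n):
    """Enumerate every function {0,1}^n -> {0,1}^n as a look-up table.

    Each table is a tuple with 2^n rows; row x holds f(x) as an integer
    in the range [0, 2^n), i.e. an n-bit string.
    """
    rows = 2 ** n
    return list(product(range(2 ** n), repeat=rows))


def func_count(n):
    """Closed-form size of Func_n: the number of strings of length n * 2^n."""
    return 2 ** (n * 2 ** n)


for n in (1, 2):
    print(n, len(all_functions(n)), func_count(n))
```

Already for n = 3 there are 2^24 tables, which is why enumeration is only feasible for toy parameters.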

Viewing a function as a look-up table provides another useful way to think about selecting a uniform function $f \in \mathsf{Func}_n$: It is exactly equivalent to choosing each row in the look-up table of $f$ uniformly. This means, in particular, that the values $f(x)$ and $f(y)$ (for any two inputs $x \neq y$) are uniform and independent. We can view this look-up table being populated by random entries in advance, before $f$ is evaluated on any input, or we can view entries of the table being chosen uniformly “on-the-fly,” as needed, whenever $f$ is evaluated on a new input on which $f$ has not been evaluated before.
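The “on-the-fly” view translates directly into code. The sketch below uses an illustrative class name, `LazyRandomFunction`, and encodes n-bit strings as Python integers for convenience. It fills in a row of the table only when that input is first queried and reuses the stored row afterwards.

```python
import secrets


class LazyRandomFunction:
    """A uniform function {0,1}^n -> {0,1}^n, sampled row by row on demand."""

    def __init__(self, n):
        self.n = n
        self.table = {}  # rows of the look-up table chosen so far

    def __call__(self, x):
        if not 0 <= x < 2 ** self.n:
            raise ValueError("input must be an n-bit value")
        # First query on x: choose the row uniformly and remember it.
        if x not in self.table:
            self.table[x] = secrets.randbits(self.n)
        # Repeated queries return the same stored value.
        return self.table[x]


f = LazyRandomFunction(128)
print(f(5) == f(5))  # a function is deterministic once a row is fixed
```

Only the queried rows are ever stored, so this simulation stays efficient even though the full table has 2^n rows.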

Coming back to our discussion of pseudorandom functions, recall that a pseudorandom function is a keyed function $F$ such that $F_k$ (for $k \in \{0,1\}^n$ chosen uniformly at random) is indistinguishable from $f$ (for $f \in \mathsf{Func}_n$ chosen uniformly at random). The former is chosen from a distribution over (at most) $2^n$ distinct functions, whereas the latter is chosen from all $2^{n \cdot 2^n}$ functions in $\mathsf{Func}_n$. Despite this, the “behavior” of these functions must look the same to any polynomial-time distinguisher.

A first attempt at formalizing the notion of a pseudorandom function would be to proceed in the same way as in Definition 3.14. That is, we could require that every polynomial-time distinguisher $D$ that receives a description of the pseudorandom function $F_k$ outputs 1 with “almost” the same probability as when it receives a description of a random function $f$. However, this definition is inappropriate since the description of a random function has exponential length (given by its look-up table of length $n \cdot 2^n$), while $D$ is limited to running in polynomial time. So, $D$ would not even have sufficient time to examine its entire input.

The definition therefore gives $D$ access to an oracle $\mathcal{O}$ which is either equal to $F_k$ (for uniform $k$) or $f$ (for a uniform function $f$). The distinguisher $D$ may query its oracle at any point $x$, in response to which the oracle returns $\mathcal{O}(x)$. We treat the oracle as a black box in the same way as when we provided the adversary with oracle access to the encryption algorithm in the definition of a chosen-plaintext attack. Here, however, the oracle computes a deterministic function and so returns the same result if queried twice on the same input. (For this reason, we may assume without loss of generality that $D$ never queries the oracle twice on the same input.) $D$ may interact freely with its oracle, choosing its queries adaptively based on all previous outputs. Since $D$ runs in polynomial time, however, it can ask only polynomially many queries.

We now present the formal definition. (The definition assumes F is length-preserving for simplicity only.)

DEFINITION 3.25 Let $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$ be an efficient, length-preserving, keyed function. $F$ is a pseudorandom function if for all probabilistic polynomial-time distinguishers $D$, there is a negligible function $\mathsf{negl}$ such that:

$$\left| \Pr\left[D^{F_k(\cdot)}(1^n) = 1\right] - \Pr\left[D^{f(\cdot)}(1^n) = 1\right] \right| \le \mathsf{negl}(n),$$

where the first probability is taken over uniform choice of $k \in \{0,1\}^n$ and the randomness of $D$, and the second probability is taken over uniform choice of $f \in \mathsf{Func}_n$ and the randomness of $D$.

An important point is that $D$ is not given the key $k$. It is meaningless to require that $F_k$ be pseudorandom if $k$ is known, since given $k$ it is trivial to distinguish an oracle for $F_k$ from an oracle for $f$. (All the distinguisher has to do is query the oracle at any point $x$ to obtain the answer $y$, and compare this to the result $y' := F_k(x)$ that it computes itself using the known value $k$. An oracle for $F_k$ will return $y = y'$, while an oracle for a random function will have $y = y'$ with probability only $2^{-n}$.) This means that if $k$ is revealed, any claims about the pseudorandomness of $F_k$ no longer hold. To take a concrete example, if $F$ is a pseudorandom function, then given oracle access to $F_k$ (for uniform $k$) it must be hard to find an input $x$ for which $F_k(x) = 0^n$ (since it would be hard to find such an input for a truly random function $f$). But if $k$ is known, finding such an input may be easy.

Example 3.26

As usual, we can gain familiarity with the definition by looking at an insecure example. Define the keyed, length-preserving function $F$ by $F(k, x) = k \oplus x$.

For any input $x$, the value of $F_k(x)$ is uniformly distributed (when $k$ is uniform). Nevertheless, $F$ is not pseudorandom since its values on any two points are correlated. Concretely, consider the distinguisher $D$ that queries its oracle $\mathcal{O}$ on arbitrary, distinct points $x_1, x_2$ to obtain values $y_1 = \mathcal{O}(x_1)$ and $y_2 = \mathcal{O}(x_2)$, and outputs 1 if and only if $y_1 \oplus y_2 = x_1 \oplus x_2$. If $\mathcal{O} = F_k$, for any $k$, then $D$ outputs 1. On the other hand, if $\mathcal{O} = f$ for $f$ chosen uniformly from $\mathsf{Func}_n$, then the probability that $f(x_1) \oplus f(x_2) = x_1 \oplus x_2$ is exactly the probability that $f(x_2) = x_1 \oplus x_2 \oplus f(x_1)$, or $2^{-n}$, and $D$ outputs 1 with this probability. The difference is $|1 - 2^{-n}|$, which is not negligible. ♦
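The attack in Example 3.26 is simple enough to run. The sketch below fixes n = 128 and encodes bit strings as integers; the names `keyed_oracle`, `random_oracle`, and `distinguisher` are illustrative. It plays the distinguisher against both kinds of oracle.

```python
import secrets

N = 128  # security parameter (illustrative choice)


def keyed_oracle(n=N):
    """Oracle for F_k(x) = k XOR x with a uniform, hidden key k."""
    k = secrets.randbits(n)
    return lambda x: k ^ x


def random_oracle(n=N):
    """Oracle for a uniform function, sampled lazily row by row."""
    table = {}

    def f(x):
        if x not in table:
            table[x] = secrets.randbits(n)
        return table[x]

    return f


def distinguisher(oracle):
    """Query two distinct points; output 1 iff y1 XOR y2 == x1 XOR x2."""
    x1, x2 = 0, 1
    y1, y2 = oracle(x1), oracle(x2)
    return 1 if (y1 ^ y2) == (x1 ^ x2) else 0


trials = 100
print(sum(distinguisher(keyed_oracle()) for _ in range(trials)))   # always outputs 1
print(sum(distinguisher(random_oracle()) for _ in range(trials)))  # outputs 1 w.p. 2^-n per trial
```

Two queries suffice, so the distinguisher is far within the polynomial-time budget the definition allows.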

Pseudorandom Permutations/Block Ciphers

Let $\mathsf{Perm}_n$ be the set of all permutations (i.e., bijections) on $\{0,1\}^n$. Viewing any $f \in \mathsf{Perm}_n$ as a look-up table as before, we now have the added constraint that the entries in any two distinct rows must be different. We have $2^n$ different choices for the entry in the first row of the table; once we fix this entry, we are left with only $2^n - 1$ choices for the second row, and so on. We thus see that the size of $\mathsf{Perm}_n$ is $(2^n)!$.

Let $F$ be a keyed function. We call $F$ a keyed permutation if $\ell_{in} = \ell_{out}$, and furthermore for all $k \in \{0,1\}^{\ell_{key}(n)}$ the function $F_k : \{0,1\}^{\ell_{in}(n)} \to \{0,1\}^{\ell_{in}(n)}$ is one-to-one (i.e., $F_k$ is a permutation). We call $\ell_{in}$ the block length of $F$. As before, unless stated otherwise we assume $F$ is length-preserving and so $\ell_{key}(n) = \ell_{in}(n) = n$. A keyed permutation is efficient if there is a polynomial-time algorithm for computing $F_k(x)$ given $k$ and $x$, as well as a polynomial-time algorithm for computing $F_k^{-1}(y)$ given $k$ and $y$. That is, $F_k$ should be both efficiently computable and efficiently invertible given $k$.

The definition of what it means for an efficient, keyed permutation $F$ to be a pseudorandom permutation is exactly analogous to Definition 3.25, with the only difference being that now we require $F_k$ to be indistinguishable from a uniform permutation rather than a uniform function. That is, we require that no efficient algorithm can distinguish between access to $F_k$ (for uniform key $k$) and access to $f$ (for uniform $f \in \mathsf{Perm}_n$). It turns out that this is merely an aesthetic choice since, whenever the block length is sufficiently long, a random permutation is itself indistinguishable from a random function. Intuitively this is due to the fact that a uniform function $f$ looks identical to a uniform permutation unless distinct values $x$ and $y$ are found for which $f(x) = f(y)$, since in such a case the function cannot be a permutation. However, the probability of finding such values $x, y$ using a polynomial number of queries is negligible. (This follows from the results of Appendix A.4.)
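The negligible collision probability can be made concrete with a union bound over pairs of queries. This is a sketch of the standard birthday-style calculation, which is presumably the kind of result the appendix supplies; the symbols $q$ and $x_1, \ldots, x_q$ are introduced here for illustration. Suppose a distinguisher makes $q = q(n)$ distinct queries $x_1, \ldots, x_q$ to a uniform function $f$ with output length $\ell_{in}(n)$. Each pair of answers collides with probability $2^{-\ell_{in}(n)}$, so

$$\Pr\left[\exists\, i < j : f(x_i) = f(x_j)\right] \le \binom{q}{2} \cdot 2^{-\ell_{in}(n)} \le \frac{q^2}{2^{\ell_{in}(n)+1}}.$$

When $\ell_{in}(n) \ge n$ and $q$ is polynomial in $n$, the right-hand side is at most $q(n)^2 / 2^{n+1}$, which is negligible.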

PROPOSITION 3.27 If $F$ is a pseudorandom permutation and additionally $\ell_{in}(n) \ge n$, then $F$ is also a pseudorandom function.

If $F$ is a keyed permutation then cryptographic schemes based on $F$ might require the honest parties to compute the inverse $F_k^{-1}$ in addition to computing $F_k$ itself. This potentially introduces new security concerns. In particular, it may now be necessary to impose the stronger requirement that $F_k$ be indistinguishable from a uniform permutation even if the distinguisher is additionally given oracle access to the inverse of the permutation. If $F$ has this property, we call it a strong pseudorandom permutation.

DEFINITION 3.28 Let $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$ be an efficient, length-preserving, keyed permutation. $F$ is a strong pseudorandom permutation if for all probabilistic polynomial-time distinguishers $D$, there exists a negligible function $\mathsf{negl}$ such that:

$$\left| \Pr\left[D^{F_k(\cdot), F_k^{-1}(\cdot)}(1^n) = 1\right] - \Pr\left[D^{f(\cdot), f^{-1}(\cdot)}(1^n) = 1\right] \right| \le \mathsf{negl}(n),$$

where the first probability is taken over uniform choice of $k \in \{0,1\}^n$ and the randomness of $D$, and the second probability is taken over uniform choice of $f \in \mathsf{Perm}_n$ and the randomness of $D$.

Of course, any strong pseudorandom permutation is also a pseudorandom permutation.

Block ciphers. In practice, block ciphers are designed to be secure instantiations of (strong) pseudorandom permutations with some fixed key length and block length. We discuss approaches for building block ciphers, and some popular candidate block ciphers, in Chapter 6. For the purposes of the present chapter the details of these constructions are unimportant, and for now we simply assume that (strong) pseudorandom permutations exist.

Pseudorandom functions and pseudorandom generators. As one might expect, there is a close relationship between pseudorandom functions and pseudorandom generators. It is fairly easy to construct a pseudorandom generator $G$ from a pseudorandom function $F$ by simply evaluating $F$ on a series of different inputs; e.g., we can define $G(s) \stackrel{\text{def}}{=} F_s(1) \,\|\, F_s(2) \,\|\, \cdots \,\|\, F_s(\ell)$ for any desired $\ell$. If $F_s$ were replaced by a uniform function $f$, the output of $G$ would be uniform; thus, when using $F$ instead, the output is pseudorandom. You are asked to prove this formally in Exercise 3.14.
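The generator G(s) = F_s(1) ‖ ⋯ ‖ F_s(ℓ) can be sketched as follows. The text does not fix a concrete F, so HMAC-SHA256 is used here purely as a stand-in under the assumption that it behaves as a pseudorandom function; the 32-byte input encoding and the name `BLOCK_BYTES` are likewise illustrative choices.

```python
import hashlib
import hmac

BLOCK_BYTES = 32  # n = 256 bits, matching the HMAC-SHA256 output length


def F(k: bytes, x: int) -> bytes:
    """Stand-in PRF (an assumption): HMAC-SHA256 on an n-bit encoding of x."""
    return hmac.new(k, x.to_bytes(BLOCK_BYTES, "big"), hashlib.sha256).digest()


def G(s: bytes, ell: int) -> bytes:
    """G(s) = F_s(1) || F_s(2) || ... || F_s(ell)."""
    return b"".join(F(s, i) for i in range(1, ell + 1))


seed = bytes(BLOCK_BYTES)  # fixed all-zero seed, for demonstration only
print(len(G(seed, 4)))     # 4 blocks of 32 bytes
```

Because each block depends only on the seed and its index, any prefix of a longer output equals the shorter output for the same seed.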

More generally, we can use the above idea to construct a stream cipher (Init, GetBits) that accepts an initialization vector $IV$. (See Section 3.3.1.) The only difference is that instead of evaluating $F_s$ on the fixed input sequence $1, 2, 3, \ldots$, we evaluate $F_s$ on the inputs $IV + 1, IV + 2, \ldots$.

CONSTRUCTION 3.29

Let F be a pseudorandom function. Define a stream cipher (Init, GetBits), where each call to GetBits outputs n bits, as follows:

• Init: on input $s \in \{0,1\}^n$ and $IV \in \{0,1\}^n$, set $\mathsf{st}_0 := (s, IV)$.

• GetBits: on input $\mathsf{st}_i = (s, IV)$, compute $IV' := IV + 1$ and set $y := F_s(IV')$ and $\mathsf{st}_{i+1} := (s, IV')$. Output $(y, \mathsf{st}_{i+1})$.

A stream cipher from any pseudorandom function/block cipher.
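Construction 3.29 can be sketched in a few lines. As an assumption for illustration, F is instantiated with HMAC-SHA256 (the text leaves F abstract), the IV is held as an integer, and IV + 1 is taken modulo 2^n so that the counter stays an n-bit value; the function names `init` and `get_bits` mirror Init and GetBits.

```python
import hashlib
import hmac

N_BYTES = 32  # n = 256 bits


def F(s: bytes, x: int) -> bytes:
    """Stand-in PRF (an assumption): HMAC-SHA256 on an n-bit encoding of x."""
    return hmac.new(s, x.to_bytes(N_BYTES, "big"), hashlib.sha256).digest()


def init(s: bytes, iv: int):
    """Init: st_0 := (s, IV)."""
    return (s, iv)


def get_bits(state):
    """GetBits: IV' := IV + 1, y := F_s(IV'), st_{i+1} := (s, IV')."""
    s, iv = state
    iv_next = (iv + 1) % (1 << (8 * N_BYTES))  # wrap modulo 2^n (assumption)
    y = F(s, iv_next)
    return y, (s, iv_next)


state = init(bytes(N_BYTES), 7)
y1, state = get_bits(state)
y2, state = get_bits(state)
print(y1.hex()[:16], y2.hex()[:16])
```

The state carries only the seed and the current counter, so producing more output is just a matter of calling `get_bits` again.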

Although stream ciphers can be constructed from block ciphers, dedicated stream ciphers used in practice typically have better performance, especially in resource-constrained environments. On the other hand, stream ciphers appear to be less well understood (in practice) than block ciphers, and confidence in their security is lower. It is therefore recommended to use block ciphers (possibly by converting them to stream ciphers first) whenever possible.

Considering the other direction, a pseudorandom generator $G$ immediately gives a pseudorandom function $F$ with small block length. Specifically, say $G$ has expansion factor $n \cdot 2^{t(n)}$. We can define the keyed function $F : \{0,1\}^n \times \{0,1\}^{t(n)} \to \{0,1\}^n$ as follows: to compute $F_k(i)$, first compute $G(k)$ and interpret the result as a look-up table with $2^{t(n)}$ rows each containing $n$ bits; output the $i$th row. This runs in polynomial time only if $t(n) = O(\log n)$. It is possible, though more difficult, to construct pseudorandom functions with large block length from pseudorandom generators; this is shown in Section 7.5. Pseudorandom generators, in turn, can be constructed based on certain mathematical problems conjectured to be hard. The existence of pseudorandom functions based on these hard mathematical problems represents one of the amazing contributions of modern cryptography.
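The look-up-table construction of a small-block-length function from a generator can be sketched as follows. As a stand-in for G, which the text leaves abstract, the sketch uses SHAKE-256 as an extendable-output function, under the assumption that its output is pseudorandom for a uniform seed; the parameters `N_BYTES` and `T` are illustrative.

```python
import hashlib

N_BYTES = 16  # n = 128 bits per table row
T = 4         # t(n): input length in bits, so the table has 2^T rows


def G(k: bytes, out_len: int) -> bytes:
    """Stand-in PRG (an assumption): SHAKE-256 expanded to out_len bytes."""
    return hashlib.shake_256(k).digest(out_len)


def F(k: bytes, i: int) -> bytes:
    """F_k(i): expand k into a table of 2^T rows of n bits; return row i."""
    if not 0 <= i < 2 ** T:
        raise ValueError("input must be a T-bit value")
    table = G(k, N_BYTES * 2 ** T)
    return table[i * N_BYTES:(i + 1) * N_BYTES]


key = bytes(N_BYTES)  # fixed all-zero key, for demonstration only
print(F(key, 3).hex())
```

Every evaluation recomputes the entire table, whose size is n · 2^T bits; this is the reason the approach is only practical when T grows logarithmically in n.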

3.5.2 CPA-Secure Encryption from Pseudorandom Functions

We focus here on constructing a CPA-secure, fixed-length encryption scheme. By what we have said at the end of Section 3.4.2, this implies the existence of a CPA-secure encryption scheme for arbitrary-length messages. In Section 3.6 we will discuss more efficient ways of encrypting messages of arbitrary length.

A naive attempt at constructing a secure encryption scheme from a pseudorandom permutation is to define $\mathsf{Enc}_k(m) = F_k(m)$. Although we expect that this “reveals no information about $m$” (since, if $f$ is a uniform function, then $f(m)$ is simply a uniform $n$-bit string), this method of encryption is deterministic and so cannot possibly be CPA-secure. In particular, encrypting the same plaintext twice will yield the same ciphertext.

Our secure construction is randomized. Specifically, we encrypt by applying the pseudorandom function to a random value $r$ (rather than the message) and XORing the result with the plaintext. (See Figure 3.3 and Construction 3.30.) This can again be viewed as an instance of XORing a pseudorandom pad with the plaintext, with the major difference being the fact that a fresh pseudorandom pad is used each time. (In fact, the pseudorandom pad is only “fresh” if the pseudorandom function is applied to a “fresh” value on which it has never been applied before. While it is possible that a random $r$ will be equal to some $r$-value chosen previously, this happens with only negligible probability.)

FIGURE 3.3: Encryption with a pseudorandom function. (The random string r is input to the pseudorandom function to produce a pad, which is XORed with the plaintext to give the ciphertext.)

Proofs of security based on pseudorandom functions. Before turning to the proof that the above construction is CPA-secure, we highlight a common template that is used by most proofs of security (even outside the context of encryption) for constructions based on pseudorandom functions. The first step of such proofs is to consider a hypothetical version of the construction in which the pseudorandom function is replaced with a random function. It is then argued—using a proof by reduction—that this modification does not significantly affect the attacker’s success probability. We are then left with analyzing a scheme that uses a completely random function. At this point the rest of the proof typically relies on probabilistic analysis and does not rely on any computational assumptions. We will utilize this proof template several times in this and the next chapter.

CONSTRUCTION 3.30

Let F be a pseudorandom function. Define a private-key encryption scheme for messages of length n as follows:

• Gen: on input $1^n$, choose uniform $k \in \{0,1\}^n$ and output it.

• Enc: on input a key $k \in \{0,1\}^n$ and a message $m \in \{0,1\}^n$, choose uniform $r \in \{0,1\}^n$ and output the ciphertext

$$c := \langle r, F_k(r) \oplus m \rangle.$$

• Dec: on input a key $k \in \{0,1\}^n$ and a ciphertext $c = \langle r, s \rangle$, output the plaintext message

$$m := F_k(r) \oplus s.$$

A CPA-secure encryption scheme from any pseudorandom function.
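The three algorithms of Construction 3.30 can be sketched as below. The text treats F abstractly, so HMAC-SHA256 is used as a stand-in under the assumption that it is a pseudorandom function, with n = 256 bits; the function names are illustrative. The scheme targets CPA security only and provides no integrity protection.

```python
import hashlib
import hmac
import secrets

N_BYTES = 32  # n = 256 bits: key, message, and r all have this length


def F(k: bytes, r: bytes) -> bytes:
    """Stand-in PRF (an assumption): HMAC-SHA256 keyed by k."""
    return hmac.new(k, r, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def gen() -> bytes:
    """Gen: choose a uniform n-bit key."""
    return secrets.token_bytes(N_BYTES)


def enc(k: bytes, m: bytes):
    """Enc: choose uniform r and output <r, F_k(r) XOR m>."""
    if len(m) != N_BYTES:
        raise ValueError("message must be exactly n bits long")
    r = secrets.token_bytes(N_BYTES)
    return r, xor(F(k, r), m)


def dec(k: bytes, c) -> bytes:
    """Dec: on ciphertext <r, s>, output F_k(r) XOR s."""
    r, s = c
    return xor(F(k, r), s)


k = gen()
m = b"attack at dawn".ljust(N_BYTES, b".")
print(dec(k, enc(k, m)) == m)
```

Encrypting the same message twice yields different ciphertexts, since a fresh r is chosen each time; this is exactly what the deterministic scheme lacks.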

THEOREM 3.31 If F is a pseudorandom function, then Construction 3.30 is a CPA-secure private-key encryption scheme for messages of length n.

PROOF Let $\widetilde{\Pi} = (\widetilde{\mathsf{Gen}}, \widetilde{\mathsf{Enc}}, \widetilde{\mathsf{Dec}})$ be an encryption scheme that is exactly the same as $\Pi = (\mathsf{Gen}, \mathsf{Enc}, \mathsf{Dec})$ from Construction 3.30, except that a truly random function $f$ is used in place of $F_k$. That is, $\widetilde{\mathsf{Gen}}(1^n)$ chooses a uniform function $f \in \mathsf{Func}_n$, and $\widetilde{\mathsf{Enc}}$ encrypts just like $\mathsf{Enc}$ except that $f$ is used instead of $F_k$. (This modified encryption scheme is not efficient. But we can still define it as a hypothetical encryption scheme for the sake of the proof.)

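Fix an arbitrary ppt adversary $\mathcal{A}$, and let $q(n)$ be an upper bound on the number of queries that $\mathcal{A}(1^n)$ makes to its encryption oracle. (Note that $q$ must be upper-bounded by some polynomial.) As the first step of the proof, we show that there is a negligible function $\mathsf{negl}$ such that

$$\left| \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\Pi}(n) = 1\right] - \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\widetilde{\Pi}}(n) = 1\right] \right| \le \mathsf{negl}(n). \tag{3.8}$$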

We prove this by reduction. We use $\mathcal{A}$ to construct a distinguisher $D$ for the pseudorandom function $F$. The distinguisher $D$ is given oracle access to some function $\mathcal{O}$, and its goal is to determine whether this function is “pseudorandom” (i.e., equal to $F_k$ for uniform $k \in \{0,1\}^n$) or “random” (i.e., equal to $f$ for uniform $f \in \mathsf{Func}_n$). To do this, $D$ emulates experiment $\mathsf{PrivK}^{\mathsf{cpa}}$ for $\mathcal{A}$ in the manner described below, and observes whether $\mathcal{A}$ succeeds or not. If $\mathcal{A}$ succeeds then $D$ guesses that its oracle must be a pseudorandom function, whereas if $\mathcal{A}$ does not succeed then $D$ guesses that its oracle must be a random function. In detail:

Distinguisher D:

$D$ is given input $1^n$ and access to an oracle $\mathcal{O} : \{0,1\}^n \to \{0,1\}^n$.

1. Run $\mathcal{A}(1^n)$. Whenever $\mathcal{A}$ queries its encryption oracle on a message $m \in \{0,1\}^n$, answer this query in the following way:

(a) Choose uniform $r \in \{0,1\}^n$.

(b) Query $\mathcal{O}(r)$ and obtain response $y$.

(c) Return the ciphertext $\langle r, y \oplus m \rangle$ to $\mathcal{A}$.

2. When $\mathcal{A}$ outputs messages $m_0, m_1 \in \{0,1\}^n$, choose a uniform bit $b \in \{0,1\}$ and then:

(a) Choose uniform $r \in \{0,1\}^n$.

(b) Query $\mathcal{O}(r)$ and obtain response $y$.

(c) Return the challenge ciphertext $\langle r, y \oplus m_b \rangle$ to $\mathcal{A}$.

3. Continue answering encryption-oracle queries of $\mathcal{A}$ as before until $\mathcal{A}$ outputs a bit $b'$. Output 1 if $b' = b$, and 0 otherwise.
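The reduction above can be sketched as a function that wraps an adversary. The two-method adversary interface (`choose`, `guess`) and the toy `PeekAdversary` are hypothetical, introduced only to make the sketch runnable; they are not from the text.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def distinguisher(oracle, adversary, n_bytes):
    """D: emulate the CPA experiment for the adversary using the oracle O."""

    def enc_oracle(m):
        # Steps (a)-(c): choose uniform r, query O(r), return <r, y XOR m>.
        r = secrets.token_bytes(n_bytes)
        return r, xor(oracle(r), m)

    m0, m1 = adversary.choose(enc_oracle)               # step 1, then A outputs m0, m1
    b = secrets.randbelow(2)                            # step 2: uniform bit b
    challenge = enc_oracle((m0, m1)[b])                 # challenge built like any ciphertext
    b_guess = adversary.guess(challenge, enc_oracle)    # step 3: A outputs b'
    return 1 if b_guess == b else 0


class PeekAdversary:
    """Toy adversary (hypothetical): wins only if the pad happens to be all zeros."""

    def __init__(self, n_bytes):
        self.m0 = bytes(n_bytes)
        self.m1 = b"\xff" * n_bytes

    def choose(self, enc_oracle):
        return self.m0, self.m1

    def guess(self, challenge, enc_oracle):
        _, s = challenge
        return 0 if s == self.m0 else 1


# Against a degenerate all-zero oracle the ciphertext exposes m_b, so D outputs 1.
print(distinguisher(lambda r: bytes(4), PeekAdversary(4), 4))
```

D never sees a key: it only forwards queries to whichever oracle it was given, which is what lets the same code run in both the pseudorandom and the random world.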

$D$ runs in polynomial time since $\mathcal{A}$ does. The key points are as follows:

1. If $D$’s oracle is a pseudorandom function, then the view of $\mathcal{A}$ when run as a subroutine by $D$ is distributed identically to the view of $\mathcal{A}$ in experiment $\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\Pi}(n)$. This is because, in this case, a key $k$ is chosen uniformly at random and then every encryption is carried out by choosing a uniform $r$, computing $y := F_k(r)$, and setting the ciphertext equal to $\langle r, y \oplus m \rangle$, exactly as in Construction 3.30. Thus,

$$\Pr_{k \leftarrow \{0,1\}^n}\left[D^{F_k(\cdot)}(1^n) = 1\right] = \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\Pi}(n) = 1\right], \tag{3.9}$$

where we emphasize that $k$ is chosen uniformly on the left-hand side.

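2. If $D$’s oracle is a random function, then the view of $\mathcal{A}$ when run as a subroutine by $D$ is distributed identically to the view of $\mathcal{A}$ in experiment $\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\widetilde{\Pi}}(n)$. This can be seen exactly as above, with the only difference being that a uniform function $f \in \mathsf{Func}_n$ is used instead of $F_k$. Thus,

$$\Pr_{f \leftarrow \mathsf{Func}_n}\left[D^{f(\cdot)}(1^n) = 1\right] = \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\widetilde{\Pi}}(n) = 1\right], \tag{3.10}$$

where $f$ is chosen uniformly from $\mathsf{Func}_n$ on the left-hand side.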

By the assumption that $F$ is a pseudorandom function (and since $D$ is efficient), there exists a negligible function $\mathsf{negl}$ for which

$$\left| \Pr\left[D^{F_k(\cdot)}(1^n) = 1\right] - \Pr\left[D^{f(\cdot)}(1^n) = 1\right] \right| \le \mathsf{negl}(n).$$

Combining the above with Equations (3.9) and (3.10) gives Equation (3.8).

For the second part of the proof, we show that

$$\Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\widetilde{\Pi}}(n) = 1\right] \le \frac{1}{2} + \frac{q(n)}{2^n}. \tag{3.11}$$

(Recall that $q(n)$ is a bound on the number of encryption queries made by $\mathcal{A}$. The above holds even if we place no computational restrictions on $\mathcal{A}$.) To see that Equation (3.11) holds, observe that every time a message $m$ is encrypted in $\mathsf{PrivK}^{\mathsf{cpa}}_{\mathcal{A},\widetilde{\Pi}}(n)$ (either by the encryption oracle or when the challenge ciphertext is computed), a uniform $r \in \{0,1\}^n$ is chosen and the ciphertext is set equal to $\langle r, f(r) \oplus m \rangle$. Let $r^*$ denote the random string used when generating the challenge ciphertext $\langle r^*, f(r^*) \oplus m_b \rangle$. There are two possibilities:

1. The value $r^*$ is never used when answering any of $\mathcal{A}$’s encryption-oracle queries: In this case, $\mathcal{A}$ learns nothing about $f(r^*)$ from its interaction with the encryption oracle (since $f$ is a truly random function). This means that, as far as $\mathcal{A}$ is concerned, the value $f(r^*)$ that is XORed with $m_b$ is uniformly distributed and independent of the rest of the experiment, and so the probability that $\mathcal{A}$ outputs $b' = b$ in this case is
