Lecture 13: Message Authentication Codes

(1)

Last modified 2015/02/02

In CCA security, the distinguisher can ask the library to decrypt arbitrary ciphertexts of its choosing.

Now in addition to the ciphertexts honestly generated by the library using the encryption key, there are maliciously crafted ciphertexts also floating around.

Towards achieving CCA security, it would be helpful to provide a mechanism to the decryption algorithm to distinguish honestly-generated ciphertexts from all other ciphertexts. We’re not interested in the problem ofprivacy (hiding information from an adversary); rather we are interested here in the problem ofauthenticity. We want a way to enforce that only someone with the secret key could have generated a particular piece of information (in this case a ciphertext).

The tool for the job is called amessage authentication code, or MAC for short. It consists of two algorithms:

• KeyGen: samples a secret key for the MAC

• MAC(k, m): given the secret key k and a plaintext m ∈ M, output a MAC (sometimes called a “tag”) t. Typically MAC is deterministic.

Confusingly, we use the term MAC to sometimes refer to the scheme itself, and sometimes just the

“tag” associated with the message (i.e., “the MAC of m”).

Think of a MAC as a kind of signature that will be appended to a piece of data. Only someone with the secret key can generate a valid MAC. Our intuitive security requirement is that: an adversary who sees many valid MACs (m, MAC(k, m)) cannot produce aforgery — a valid MAC of a different message.

13.1 Security Definition

This is a totally new kind of security requirement. It has nothing to do with hiding information.

Rather, it is about whether the adversary can actually generate a particular piece of data (e.g., a MAC forgery).

To express such a requirement in a security definition, we need some new tricks in our reper- toire. We want to talk about the inability of an adversary to generate a particular piece of data. So let’s start with a much simpler statement: “no adversary should be able to guess a uniformly chosen value.” We formalize this idea with the following two libraries:

Lleft:

r ← {0, 1}ⁿ

GUESS(g):

if g = r return 1 else return 0

Lright: ^GUESS(g):

return 0

c

(2)

110 LECTURE 13. MESSAGE AUTHENTICATION CODES

The left library allows the calling program to attempt to guess a uniformly chosen string. The right library doesn’t even bother to verify the calling program’s guess — it doesn’t even bother to sample a random target string!

Focus on the difference between these two libraries. Their GUESS subroutines give the same output on nearly all inputs. There is only one input r on which they disagree. If a calling program can manage to find that input r, then it can easily distinguish the libraries. Therefore, if the libraries are indistinguishable, it means that the adversary cannot find/generate one of these special inputs!

That’s the kind of property we want to express.

Indeed, in this case, an adversary who makes q queries to theGUESSsubroutine achieves a bias of at most q/2ⁿ. For polynomial-time adversaries, this is a negligible bias.

The MAC security definition. Let’s follow the pattern from the simple example above. We want to say that no adversary can generate a MAC forgery. So our libraries will provide a mechanism to let the adversary check whether it has a forgery. One library will actually perform the check, and the other library will simply say “no.” The two libraries are different only in how they behave when the adversary calls a subroutine on a true forgery. So by demanding that the two libraries be indistinguishable, we are actually demanding that no adversary can generate forgeries (except with negligible probability).

The formal definition is below:

Definition: Let Σ be a MAC scheme, and letL^Σ_mac-real andL^Σ_mac-fakebe defined as below. Then Σ is asecure MAC ifL^Σ_mac-real∼=L^Σ_mac-fake.

L^Σ_mac-real:

k ← Σ.KeyGen

GETMAC(m ∈ Σ.M):

return Σ.MAC(k, m)

VER(m ∈ Σ.M, t):

if t = Σ.MAC(k, m):

return true else return false

L^Σ_mac-fake:

k ← Σ.KeyGen T = ∅

GETMAC(m ∈ Σ.M):

t := Σ.MAC(k, m) T := T ∪ {(m, t)}

return t

VER(m ∈ Σ.M, t):

if (m, t) ∈ T : return true else return false

We do allow the adversary to request MACs of chosen messages, hence theGETMACsubroutine.

This means that the adversary is always capable of obtaining valid MACs. However, MACs that were generated byGETMACdon’t count as forgeries — only a MAC of a different message would be a forgery.

For this reason, theLmac-fakelibrary keeps track of which MACs were generated by GETMAC, in the set T . It is clear that these MACs will always be judged valid byVER. But for all other inputs to

VER,Lmac-fakesimply answers false.

TODO: discuss: adversary wins by doing any forgery – we don’t care whether it’s a forgery on some particular (meaningful) message

(3)

13.2 A PRF is a MAC

The definition of a PRF says (more or less) that even if you’ve seen the output of the PRF on several chosen inputs, all other outputs look independently uniformly random. And, as we saw in our earlier example, uniformly random values are hard to guess. Putting these two observations together, and we’re awfully close to to the MAC definition. Indeed, a PRF is a secure MAC.

There is one subtlety worth pointing out. An adversary might make call VER on (m, t) trying to produce a forgery, and then later call GETMAC(m) (on the same m) to learn the “true” value of MAC(k, m). This idea is not captured by our simple guess-the-random number libraries from above.

To make that example more relevant, we have to extend it as follows:

Lleft:

r ← {0, 1}ⁿ

GUESS(g):

PEEK():

return r

Lright:

r :=undefined

GUESS(g):

if r undefined:

return 0 else:

PEEK() : r ← {0, 1}ⁿ return r

Now the libraries allow the calling program to actually see the secret value r via the PEEKsubrou- tine. Intuitively, if you haven’t yet PEEK’ed, then it should be hard to guess the secret value. The Lrightlibrary waits until the adversaryPEEK s to actually sample r. Before that time,GUESSsimply answers no.

TODO: expand, give more of a formal proof

Claim: The scheme MAC(k, m) = F (k, m) is a secure MAC, when F is a secure pseudorandom function.

13.3 CBC-MAC

A PRF typically supports only messages of a fixed length, but we will soon see that it’s useful to have a MAC that supports longer messages. A classical approach to extend the input size of a MAC involves the CBC block cipher mode applied to a PRF. The result is shown in Figure13.1.

CBC-MAC differs from CBC encryption mode in two important ways: First, there is no initialization vector. Indeed, CBC-MAC is deterministic (you can think of it as CBC encryption but with the initialization vector fixed to all zeroes). Second, CBC-MAC outputs only the last block.

Claim: If F is a secure PRF, then for every fixed `, CBC-MAC is a secure MAC for message space M = {0, 1}^n`.

TODO: The proof of this claim is slightly beyond the scope of these notes.

Maybe I will eventually find a way to incorporate it.

(4)

CBCMAC^F(k, m1· · · m_`):

t := 0ⁿ: for i = 1 to `:

t := F (k, m_i⊕ t) return t

Fk Fk Fk

⊕ ⊕

m1 m2 m_`

t

· · ·

Figure 13.1. CBC-MAC

Note that we have restricted the message space to messages of exactly ` blocks. Unlike CBC encryption, CBC-MAC isnot suitable for messages of variable lengths. If the adversary is allowed to request the CBC-MAC of messages of different lengths, then it is possible for the adversary to generate a forgery (see the homework).

13.4 Encrypt-Then-MAC

It is possible to achieve CCA security in a modular way, combining any CPA-secure encryption scheme with a secure MAC. One must be careful, though, to combine these two components in the right way!

Let E denote an encryption scheme, and M denote a MAC scheme. Then let EtM denote the encrypt-then-MAC construction given below:

EtM.KeyGen:

ke← E.KeyGen km← M.KeyGen return (k_e, k_m)

EtM.Enc((ke, km), m):

c ← E.Enc(k_e, m) t := M.MAC(km, c) return (c, t)

EtM.Dec((ke, km), (c, t)):

if t 6= M.MAC(k_m, c):

returnerr return E.Dec(ke, c)

In the Dec algorithm, we are interpreting “err” as a special symbol that will never be confused with a valid plaintext.

Claim: Encrypt-then-MAC has CCA security if its underlying encryption scheme has CPA security and its underlying MAC scheme is a secure MAC (with message space of the MAC containing the ciphertext space of the encryption scheme).

Proof: As usual, we start with an arbitrary adversary A linked to L^EtM_cca-L. In this interaction, the

CHALLENGE subroutine will encrypt mL under E.Enc. Our goal is to therefore find a way to apply the CPA security of E to switch mLwith mR. What is preventing us from doing just that?

The difficulty arises in the fact that E.Dec appears in the code ofLcca-L. If we were to factor out the E-encryption statements in terms ofLÊ_cpa-L, we would also have to make k local to the LÊ_cpa-L library. Then there would be no way to carry out the E.Dec that remain in Lcca-L, since LÊ_cpa-L provides no way to perform decryptions using the (local) secret key k.

However, our use of a MAC provides a path to writing the interaction in terms ofLcpa-L. After applying the security of the MAC, we will obtain an interaction in which the E.Dec statement is un- reachable. Hence, it can be safely removed! What remains is an interaction involving only E.KeyGen and E.Enc, which can be written in terms ofLcpa-L. At that point, we can switch E.Enc(k, m_L)with E.Enc(k, mR)and the rest of the proof follows as a mirror image.

(5)

A

ke← E.KeyGen k_m← M.KeyGen S := ∅

CHALLENGE(m_L, m_R):

if |mL| 6= |m_R| return null c ← E.Enc(k_e, m_L) t ← M.MAC(km, c) S := S ∪ { (c, t) } return (c, t)

DEC(c, t):

if (c, t) ∈ S return null if t 6= M.MAC(km, c):

A

ke← E.KeyGen S := ∅

CHALLENGE(mL, mR):

if |mL| 6= |m_R| return null c ← E.Enc(ke, m_L) t := GETMAC(c) S := S ∪ {(c, t)}

return (c, t)

DEC(c, t):

if (c, t) ∈ S return null if notVER(c, t):

k_m ← M.KeyGen

GETMAC(c):

return M.MAC(km, c)

VER(c, t):

if t = M.MAC(km, c):

return true else return false

(a) (b)

A

ke← E.KeyGen S := ∅

CHALLENGE(mL, mR):

if |m_L| 6= |m_R| return null c ← E.Enc(ke, mL) t :=GETMAC(c) S := S ∪ {(c, t)}

return (c, t)

DEC(c, t):

if (c, t) ∈ S return null if notVER(c, t):

returnerr return E.Dec(k_e, c)

km← M.KeyGen T = ∅

GETMAC(c):

t := M.MAC(km, c) T := T ∪ {(c, t)}

return t

VER(c, t):

if (c, t) ∈ T : return true else return false

A

k_e← E.KeyGen km ← M.KeyGen S := ∅

CHALLENGE(mL, mR):

if |mL| 6= |m_R| return null c ← E.Enc(ke, m_L) t := M.MAC(k_m, c) S := S ∪ {(c, t)}

return (c, t)

DEC(c, t):

if (c, t) ∈ S return null if (c, t) 6∈ S :

(c) (d)

Figure 13.2. Some steps in the CCA-security proof of encrypt-then-MAC.

(6)

A

k_m← M.KeyGen S := ∅

CHALLENGE(m_L, m_R):

if |m_L| 6= |m_R| return null

c := CHALLENGE⁰(m_L, m_R) t := M.MAC(k_m, c)

S := S ∪ {(c, t)}

return (c, t)

DEC(c, t):

if (c, t) ∈ S return null if (c, t) 6∈ S:

returnerr

ke← E.KeyGen

CHALLENGE⁰(mL, mR):

c := E.Enc(ke, m_L) return c

(e)

Figure 13.3. Some steps in the CCA-security proof of encrypt-then-MAC, continued.

The formal steps of the proof are described in Figures13.2&13.3. In diagram (a), the details of the EtM construction have been filled in for A L^EtM_cca-L. Justification for each of the steps is given below:

(a)⇒(b) We factor out all of the statements involving the MAC scheme. No effect on the program’s output distribution.

(b)⇒(c) In (b) we have some program linked withL^M_mac-real. By the MAC security of scheme M we can replace that library with L^M_mac-fake to obtain (c). The program’s output distribution changes by at most a negligible amount.

(c)⇒(d) We inline subroutines, with no effect on the output distribution. We have simplified variables to some extent, since sets S and T are always identical.

(d)⇒(e) In program (d) the final statement of theDECsubroutine is unreachable! In other words, the Dec algorithm of the CPA-secure scheme E is never used. Now it is possible to factor out all of the statements involving the CPA-secure encryption scheme E in terms of libraryL^M_cpa-L. No effect on the program’s distribution.

With program (e) we are stopping just before the half-way point of the proof. It should be clear that we can now replace Lcpa-L withLcpa-R to begin obtaining encryptions of mR. Then reversing the steps (a)⇒(b)⇒(c)⇒(d) in the normal way, we will eventually arrive at A L^EtM_cca-Ras desired.

(7)

13.5 Problems

1. (B&R) Consider the following MAC scheme, where F is a secure PRF:

KeyGen:

k ← {0, 1}ⁿ return k

MAC(k, m₁· · · m_`):

// each miis n bits t := 0ⁿ

for i = 1 to `:

t := t ⊕ F (k, m_i) return t

Show that the scheme isnot a secure MAC. Describe a distinguisher and compute its bias.

2. (B&R) Consider the following MAC scheme, where F is a secure PRF:

KeyGen:

k ← {0, 1}ⁿ return k

MAC(k, m₁· · · m_`):

// each mi is n bits t := 0ⁿ

for i = 1 to `:

t := t ⊕ F (k, ikmi) return t

In the above, F (k, ikmi) denotes calling the PRF on an input consisting of the number i encoded in binary, followed by block m_i. Assume that F supports inputs of arbitrary length.

Show that the scheme isnot a secure MAC. Describe a distinguisher and compute its bias.

3. Suppose we expand the message space of CBC-MAC to M = ({0, 1}ⁿ)^∗. In other words, the adversary can request a MAC on a message consisting of any number of blocks. Show that the result is not a secure MAC. Construct a distinguisher and compute its bias.

Hint: Request a MAC on two single-block messages, then use the result to forge the MAC of a two-block message.

?4. CBC-MAC is similar to CBC encryption, except that it outputs the last block only. For this problem, define a new variant of CBC encryption which outputs theXOR of all of the blocks.

NEWMAC^F(k, m₁· · · m_`):

t := 0ⁿ: for i = 1 to `:

t := t ⊕ F (k, mi⊕ t) return t

(a) IsNEWMACa secure MAC for message space M = ({0, 1}ⁿ)^∗?

(b) Is NEWMACa secure MAC for message space M = {0, 1}^n`(with fixed `)?

5. Let E be a CPA-secure encryption scheme and M be a secure MAC. Show that the following encryption scheme (encrypt & MAC) isnot CCA-secure:

(8)

E&M.KeyGen:

ke← E.KeyGen km ← M.KeyGen return (ke, k_m)

E&M.Enc((ke, km), m):

c ← E.Enc(k_e, m) t := M.MAC(km, m) return (c, t)

E&M.Dec((k_e, k_m), (c, t)):

m := E.Dec(ke, c) if t 6= M.MAC(k_m, m):

returnerr return m Describe a distinguisher and compute its bias.