
CHAPTER 6 : Logjam attack and measurements

6.2 Diffie-Hellman Cryptanalysis

Diffie-Hellman key exchange was the first published public-key algorithm [108]. In the simple case of prime groups, Alice and Bob agree on a prime p and a generator g of a multiplicative subgroup modulo p. Alice sends $g^a \bmod p$, Bob sends $g^b \bmod p$, and each computes a shared secret $g^{ab} \bmod p$. While there is also a Diffie-Hellman exchange over elliptic curve groups, we address only the “mod p” case.
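The exchange described above can be sketched in a few lines of Python. This is a toy illustration only: the parameters p = 23 and g = 5 are assumptions chosen so the arithmetic is easy to follow, and they are far too small for real use.

```python
import secrets

# Toy parameters for illustration only (not from the text, and far too small
# to be secure): 23 = 2*11 + 1 is a safe prime and 5 generates the full
# multiplicative group modulo 23.
p = 23
g = 5

a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent in [1, p-2]
b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent in [1, p-2]

A = pow(g, a, p)   # Alice sends g^a mod p
B = pow(g, b, p)   # Bob sends g^b mod p

alice_secret = pow(B, a, p)   # (g^b)^a mod p
bob_secret = pow(A, b, p)     # (g^a)^b mod p

# Both sides arrive at g^(ab) mod p.
assert alice_secret == bob_secret
```

An eavesdropper sees only p, g, A, and B; recovering a from A (a discrete log) is enough to compute the shared secret as `pow(B, a, p)`.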

The security of Diffie-Hellman is not known to be equivalent to the discrete log problem (except in certain groups [105, 213, 214]), but computing discrete logs remains the best known cryptanalytic attack. An attacker who can find the discrete log x from $y = g^x \bmod p$ can easily find the shared secret.

Textbook descriptions of discrete log can be misleading about the computational tradeoffs, for example by balancing parameters to minimize overall time to compute a single discrete log. In fact, as illustrated in Figure 11, a single large precomputation on p can be used to efficiently break all Diffie-Hellman exchanges made with that prime.

The typical case. Diffie-Hellman is typically implemented with prime fields and large group orders. In this case, the most efficient discrete log algorithm is the number field sieve (NFS) [154, 183, 276].1 There is a closely related number field sieve algorithm for factoring [101, 202], and in fact many parts of the implementations can be shared. The general technique is called index calculus and has four stages with different computational properties. The first three steps are only dependent on the prime p and comprise most of the computation.

First is polynomial selection, in which one finds a polynomial f(z) defining a number field Q(z)/f(z) for the computation. (For our cases, f(z) typically has degree 5 or 6.) This parallelizes well and is only a small portion of the runtime.

In the second stage, sieving, one factors ranges of integers and number field elements in batches to find many relations of elements, all of whose prime factors are less than some bound B (called B-smooth). Modern implementations use special-q lattice sieving, which for each special q explores a sieving region of $2^{2I}$ candidates, where I is a parameter. Sieving parallelizes well since each special q is handled independently of the others, but is computationally expensive, because we must search through and attempt to factor many elements. The time for this step depends on heuristic estimates of the probability of encountering B-smooth numbers in this search; it also depends on I and on the number of special q to consider before having enough relations.
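To make the notion of B-smoothness concrete, the following is a minimal sketch using plain trial division over integers. It is not lattice sieving and not how NFS implementations search for relations; the function names `is_b_smooth` and `count_smooth` are hypothetical, and the code uses the common convention that no prime factor may exceed the bound.

```python
def is_b_smooth(n, bound):
    """Return True if no prime factor of n exceeds bound (trial division)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    d = 2
    while d <= bound and n > 1:
        # Composite d never divides n here, because its prime factors
        # have already been divided out.
        while n % d == 0:
            n //= d
        d += 1
    return n == 1


def count_smooth(start, stop, bound):
    """Count the bound-smooth integers in range(start, stop)."""
    return sum(1 for n in range(start, stop) if is_b_smooth(n, bound))


# 720 = 2^4 * 3^2 * 5 is 5-smooth; 77 = 7 * 11 is not 7-smooth.
assert is_b_smooth(720, 5)
assert not is_b_smooth(77, 7)
```

Calling `count_smooth` on windows of equal width at increasing offsets shows how the density of smooth numbers falls as the candidates grow, which is why the smoothness bound and the size of the sieving region have to be tuned together.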

In the third stage, linear algebra, we construct a large, sparse matrix consisting of the coefficient vectors of prime factorizations we have found. A nonzero kernel vector of the matrix modulo the order q of the group will give us logs of many small elements. This database of logs serves as input to the final stage. The difficulty depends on q and the matrix size and can be parallelized in a limited fashion.

1 Recent spectacular advances in discrete log algorithms have resulted in a quasi-polynomial algorithm for small-characteristic fields [47], but these advances are not known to apply to the prime fields used in practice.

The final stage, descent, actually deduces the discrete log of the target y. We re-sieve until we can find a set of relations that allow us to write the log of y in terms of the logs in the precomputed database. This step is accomplished in three phases: an initialization phase, which tries to write the target in terms of medium-sized primes, a middle phase, in which these medium-sized primes are further sieved until they can be represented by elements in the database of known logs, and a final phase that actually reconstructs the target using the log database. Crucially, descent is the only NFS stage that involves y (or g), so polynomial selection, sieving, and linear algebra can be done once for a prime p and reused to compute the discrete logs of many targets.
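The split between a per-prime precomputation and a per-target descent can be seen in a toy worked example. The sketch below uses classical index calculus over the integers modulo a tiny prime rather than the number field sieve, and the values p = 23, g = 5, factor base {2, 3}, and target y = 7 are assumptions chosen purely for illustration.

```latex
% Relations (sieving): powers of g that factor over the factor base {2, 3}.
\begin{align*}
5^{2}  &\equiv 2 \pmod{23}, &
5^{18} &\equiv 6 = 2 \cdot 3 \pmod{23}.
\end{align*}
% Linear algebra: each relation is a linear equation in the unknown logs,
% taken modulo the group order p - 1 = 22.
\begin{align*}
\log_5 2 &\equiv 2 \pmod{22}, \\
\log_5 2 + \log_5 3 &\equiv 18 \pmod{22}
  \;\Longrightarrow\; \log_5 3 \equiv 16 \pmod{22}.
\end{align*}
% Descent: only this step involves the target y = 7.
\begin{align*}
7 \cdot 5 &\equiv 12 = 2^{2} \cdot 3 \pmod{23}, \\
\log_5 7 &\equiv 2\log_5 2 + \log_5 3 - 1 \equiv 4 + 16 - 1 = 19 \pmod{22}.
\end{align*}
% Check: 5^{19} \equiv 7 \pmod{23}.
```

The first two steps depend only on p and g, so the small database of logs could be reused for any other target; only the last step has to be repeated.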

The running time of this algorithm is

$$L_p\left(1/3, (64/9)^{1/3}\right) = \exp\left((1.923 + o(1))(\log p)^{1/3}(\log\log p)^{2/3}\right).$$

This is obtained by tuning many parameters, including the degree of f, the sieving region parameter I, and, most importantly, the smoothness bound B. Early articles (e.g. [154]) encountered technical difficulties with descent and reported that the complexity of this step would equal that of the precomputation; this may have contributed to misconceptions about the performance of the NFS for discrete logs. More recent analyses have improved the complexity of descent to $L_p(1/3, 1.442)$ [96], and later to $L_p(1/3, 1.232)$ [46], which is much cheaper than the precomputation in practice.
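To get a feel for how the running-time expression scales, the sketch below evaluates it numerically with the o(1) term dropped. That simplification is an assumption: the results are useful only for comparing key sizes and stages against each other, not as operation counts. The function name `nfs_log2_cost` is hypothetical.

```python
import math


def nfs_log2_cost(bits, c=(64 / 9) ** (1 / 3)):
    """Return log2 of L_p(1/3, c) for a prime of the given bit length.

    The o(1) term is ignored, so this is a rough scaling heuristic only.
    """
    ln_p = bits * math.log(2)
    exponent = c * ln_p ** (1 / 3) * math.log(ln_p) ** (2 / 3)
    return exponent / math.log(2)


# Compare the precomputation constant (64/9)^(1/3) with the descent
# constant 1.232 quoted in the text for several prime sizes.
for bits in (512, 768, 1024):
    precomputation = nfs_log2_cost(bits)
    descent = nfs_log2_cost(bits, c=1.232)
    print(f"{bits}-bit: precomputation ~2^{precomputation:.0f}, "
          f"descent ~2^{descent:.0f}")
```

Because the constant c multiplies the exponent, the gap between the precomputation and descent constants translates into a large multiplicative gap in cost.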

The numerous parameters of the algorithm allow some flexibility to reduce time on some computational steps at the expense of others. For example, sieving more will result in a smaller matrix, making linear algebra cheaper, and doing more work in the precomputation makes the final descent step easier. In Section 6.3.3, we show how exploiting these tradeoffs allows us to quickly compute 512-bit discrete logs in order to perform an effective man-in-the-middle attack on TLS.

Improperly generated groups. A different family of algorithms runs in time exponential in group order, and they are practical even for large primes when the group order is small or has many small prime factors. To avoid this, most implementations use “safe” primes, which have the property that p − 1 = 2q for some prime q, so that the only possible subgroups have order 2, q, or 2q. However, as we show in Section 6.3.5, improperly generated groups are sometimes used in practice and susceptible to attack.
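A check for the safe-prime property can be sketched as follows. The helper names `is_probable_prime` and `is_safe_prime` are hypothetical, and the Miller-Rabin test uses a fixed set of small bases as a simplifying assumption; a production check would use a vetted library routine.

```python
def is_probable_prime(n):
    """Miller-Rabin test with fixed small bases (a simplifying assumption)."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for s in bases:
        if n % s == 0:
            return n == s
    # Write n - 1 = d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness that n is composite
    return True


def is_safe_prime(p):
    """True if p is prime and (p - 1) / 2 is also prime, i.e. p - 1 = 2q."""
    return is_probable_prime(p) and is_probable_prime((p - 1) // 2)


assert is_safe_prime(23)       # 23 - 1 = 2 * 11
assert not is_safe_prime(13)   # 13 - 1 = 2 * 6, and 6 is not prime
```

A prime that fails this check is not necessarily weak, but its group order may have small factors, which is the situation the subgroup algorithms discussed next exploit.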

The baby-step giant-step [279] and Pollard rho [261] algorithms both take $\sqrt{q}$ time to compute a discrete log in any (sub)group of order q, while Pollard lambda [261] can find x < t in time $\sqrt{t}$. These parallelize well [299], and precomputation can speed up individual log calculations. If the factorization of the subgroup order q is known, one can use any of the above algorithms to compute the discrete log in each subgroup of order $q_i^{e_i}$ dividing q, and then recover x using the Chinese remainder theorem. This is the Pohlig-Hellman algorithm [259], which costs $\sum_i e_i \sqrt{q_i}$ using baby-step giant-step or Pollard rho.
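The combination of baby-step giant-step with Pohlig-Hellman can be sketched as below. This is an illustrative sketch, not an optimized implementation: the function names are hypothetical, the factorization of the group order is assumed to be supplied by the caller, the target is assumed to lie in the subgroup generated by g, and the toy group (p = 37, g = 2, order 36 = 2^2 · 3^2) is an assumption chosen for readability. It requires Python 3.8 or later for `pow` with a negative exponent.

```python
import math


def bsgs(g, y, p, order):
    """Baby-step giant-step: find x in [0, order) with g^x = y (mod p)."""
    m = math.isqrt(order) + 1
    table = {}
    e = 1
    for j in range(m):                 # baby steps: store g^j
        table.setdefault(e, j)
        e = e * g % p
    factor = pow(g, -m, p)             # g^(-m) mod p
    gamma = y % p
    for i in range(m):                 # giant steps: y * g^(-i*m)
        if gamma in table:
            return (i * m + table[gamma]) % order
        gamma = gamma * factor % p
    return None                        # y is not a power of g


def dlog_prime_power(g, y, p, q, e):
    """Log of y to base g, where g has order q^e; one BSGS call per digit."""
    x = 0
    gamma = pow(g, q ** (e - 1), p)    # element of order q
    for k in range(e):
        # Remove the digits found so far, then project into the order-q subgroup.
        h = pow(y * pow(g, -x, p) % p, q ** (e - 1 - k), p)
        x += bsgs(gamma, h, p, q) * q ** k
    return x


def pohlig_hellman(g, y, p, factors):
    """factors maps each prime q_i to its exponent e_i in the order of g."""
    n = math.prod(q ** e for q, e in factors.items())
    x, modulus = 0, 1
    for q, e in factors.items():
        qe = q ** e
        xi = dlog_prime_power(pow(g, n // qe, p), pow(y, n // qe, p), p, q, e)
        # Chinese remainder theorem: merge x mod modulus with xi mod qe.
        x += modulus * ((xi - x) * pow(modulus, -1, qe) % qe)
        modulus *= qe
    return x


# Toy usage: 2 generates the full group modulo 37, of order 36 = 2^2 * 3^2.
assert pohlig_hellman(2, pow(2, 29, 37), 37, {2: 2, 3: 2}) == 29
```

Each call to `bsgs` works in a subgroup of prime order q_i, so the total work follows the sum over i of e_i times the square root of q_i given above, rather than the square root of the full group order.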

Standard primes. Generating primes with special properties can be computationally burdensome, so many implementations use fixed or standardized Diffie-Hellman parameters. A prominent example is the Oakley groups [249], which give “safe” primes of length 768 (Oakley Group 1), 1024 (Oakley Group 2), and 1536 (Oakley Group 5). These groups were published in 1998 and have been used for many applications since, including IKE, SSH, Tor, and OTR.

When primes are of sufficient strength, there seems to be no disadvantage to reusing them. However, widespread reuse of Diffie-Hellman groups can convert attacks that are at the limits of an adversary’s capabilities into devastating breaks, since it allows the attacker to amortize the cost of discrete log precomputation among vast numbers of potential targets.

Source      Popularity   Prime
Apache      82%          9fdb8b8a004544f0045f1737d0ba2e0b
                         274cdf1a9f588218fb435316a16e3741
                         71fd19d8d8f37c39bf863fd60e3e3006
                         80a3030c6e4c3757d08f70e6aa871033
mod_ssl     10%          d4bcd52406f69b35994b88de5db89682
                         c8157f62d8f33633ee5772f11f05ab22
                         d6b5145b9f241e5acc31ff090a4bc711
                         48976f76795094e71e7903529f5a824b
(others)    8%           (463 distinct primes)

Table 31: Top 512-bit DH primes for TLS. 8.4% of Alexa Top 1M HTTPS domains allow DHE_EXPORT, of which 92.3% use one of the two most popular primes, shown here.

In document Measuring And Securing Cryptographic Deployments (Page 165-169)