Public-Key Cryptanalysis 1:
Introduction and Factoring
Nadia Heninger University of Pennsylvania
July 21, 2013
Adventures in Cryptanalysis
Part 1: Introduction and Factoring.
What is public-key crypto really based on?
How to break RSA from the comfort of your very own home.
Part 2: Generating keys.
How not to generate your crypto keys.
And how many people who should know better have screwed it up.
Part 3: Breaking keys with lattices.
How fragile is RSA?
And why not to post your private keys to Pastebin.
Textbook Diffie-Hellman
[Diffie Hellman 1976]
Public Parameters
G a group (e.g. Fp, or an elliptic curve) g group generator
Key Exchange ga
gb
gab gab
Computational problems
Discrete Log
Problem: Given ga, compute a.
I Solving this problem permits attacker to compute shared key by computing a and raising (gb)a.
I Discrete log is in NP and coNP → not NP-complete (unless P=NP or similar).
Computational problems
Diffie-Hellman problem
Problem: Given ga, gb, compute gab.
I Exactly problem of computing shared key from public information.
I Reduces to discrete log in some cases:
I “Diffie-Hellman is as strong as discrete log for certain primes”
[den Boer 1988] “both problems are (probabilistically) polynomial-time equivalent if the totient of p − 1 has only small prime factors”
I “Towards the equivalence of breaking the Diffie-Hellman protocol and computing discrete logarithms” [Maurer 1994] “if . . . an elliptic curve with smooth order can be construted efficiently, then . . . [the discrete log] can be reduced efficiently to breakingthe Diffie-Hellman protocol”
I (Computational) Diffie-Hellman assumption: This problem is hard in general.
Textbook RSA
[Rivest Shamir Adleman 1977]
Public Key N = pq modulus
e encryption exponent
Private Key p, q primes
d decryption exponent
(d = e−1mod (p − 1)(q − 1))
Encryption public key = (N, e) ciphertext = messagee mod N
message = ciphertextd mod N
Textbook RSA
[Rivest Shamir Adleman 1977]
Public Key N = pq modulus
e encryption exponent
Private Key p, q primes
d decryption exponent
(d = e−1mod (p − 1)(q − 1))
Signing public key = (N, e) signature = messaged mod N
message = signaturee mod N
Computational problems
Factoring
Problem: Given N, compute its prime factors.
I Computationally equivalent to computing private key d .
I Factoring is in NP and coNP → not NP-complete (unless P=NP or similar).
Computational problems
eth roots mod N
Problem: Given N, e, and c, compute x such that xe ≡ c mod N.
I Equivalent to decrypting an RSA-encrypted ciphertext.
I Equivalent to selective forgery of RSA signatures.
I Conflicting results about whether it reduces to factoring:
I “Breaking RSA may not be equivalent to factoring” [Boneh Venkatesan 1998]
“an algebraic reduction from factoring to breaking
low-exponent RSA can be converted into an efficient factoring algorithm”
I “Breaking RSA generically is equivalent to factoring”
[Aggarwal Maurer 2009]
“a generic ring algorithm for breaking RSA in ZN can be converted into an algorithm for factoring”
I “RSA assumption”: This problem is hard.
Public service announcement: Textbook RSA is insecure
RSA encryption is homomorphic under multiplication. This lets an attacker do all sorts of fun things:
Attack: Malleability
Given a ciphertext c = me mod N, cae mod N is an encryption of ma for any a chosen by attacker.
Attack: Chosen ciphertext attack
Given a ciphertext c, attacker asks for decryption of cae mod N and divides by a to obtain m.
Attack: Signature forgery
Attacker convinces a signer to sign z = xye mod N and computes a valid signature of x as zd/y mod N.
So in practice we always use padding on messages.
The DSA Algorithm
DSA Public Key p prime
q prime, divides (p − 1) g generator of subgroup of
order q mod p y = gx mod p
Verify
u1= H(m)s−1mod q u2= rs−1mod q
r = g? u1yu2 mod p mod q
Private Key x private key
Sign
Generate random k.
r = gk mod p mod q s = k−1(H(m) + xr ) mod q
Computational problems
Discrete Log
I Breaking DSA is equivalent to computing discrete logs in the random oracle model. [Pointcheval, Vaudenay 96]
http://xkcd.com/538/
The rest of this lecture.
Not ethical research.
The other two lectures.
Factoring (and discrete log) the hard way
Some material borrowed from Dan Bernstein and Tanja Lange
Preliminaries: Using Sage
Working code examples will be given in Sage.
Sage is free open source mathematics software.
Download from http://www.sagemath.org/.
Sage is based on Python sage: 2*3
6
ˆ is exponentiation, not xor
It has lots of useful libraries: sage: factor(15)
3 * 5
sage: factor(x^2-1) (x - 1) * (x + 1)
Preliminaries: Using Sage
Working code examples will be given in Sage.
Sage is free open source mathematics software.
Download from http://www.sagemath.org/.
Sage is based on Python, but there are a few differences:
sage: 2^3 8
ˆ is exponentiation, not xor
It has lots of useful libraries: sage: factor(15)
3 * 5
sage: factor(x^2-1) (x - 1) * (x + 1)
Preliminaries: Using Sage
Working code examples will be given in Sage.
Sage is free open source mathematics software.
Download from http://www.sagemath.org/.
Sage is based on Python, but there are a few differences:
sage: 2^3 8
ˆ is exponentiation, not xor
It has lots of useful libraries:
sage: factor(15) 3 * 5
sage: factor(x^2-1) (x - 1) * (x + 1)
Preliminaries: Using Sage
Working code examples will be given in Sage.
Sage is free open source mathematics software.
Download from http://www.sagemath.org/.
Sage is based on Python, but there are a few differences:
sage: 2^3 8
ˆ is exponentiation, not xor
It has lots of useful libraries:
sage: factor(15) 3 * 5
sage: factor(x^2-1) (x - 1) * (x + 1)
Practicing Sage and Textbook RSA
Key generation:
sage: p = random_prime(2^512); q = random_prime(2^512) sage: N = p*q
sage: e = 65537
sage: d = inverse_mod(e,(p-1)*(q-1)) Encryption:
sage: m = Integer(’helloworld’,base=35) sage: c = pow(m,65537,N)
Decryption:
sage: Integer(pow(c,d,N)).str(base=35)
’helloworld’
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32))
170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64))
4711473922727062493 * 14104094416937800129 Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96))
4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
So how hard is factoring?
sage: time factor(random_prime(2^32)*random_prime(2^32)) 170795249 * 1091258383
Time: CPU 0.01 s, Wall: 0.01 s
sage: time factor(random_prime(2^64)*random_prime(2^64)) 4711473922727062493 * 14104094416937800129
Time: CPU 0.13 s, Wall: 0.15 s
sage: time factor(random_prime(2^96)*random_prime(2^96)) 4602215373378843555620133613 * 33422266165499970562761950677 Time: CPU 4.64 s, Wall: 4.76 s
sage: time factor(random_prime(2^128)*random_prime(2^128))
249431539076558964376759054734817465081 * 297857583322428928385090659777642423243 Time: CPU 506.95 s, Wall: 507.41 s
Factoring in practice
Two kinds of factoring algorithms:
1. Algorithms that have running time dependent on the size of the factor.
I Good for factoring small numbers, and finding small factors of big numbers.
2. General-purpose algorithms that depend on the size of the number to be factored.
I State-of-the-art algorithms for factoring really big numbers.
Trial Division
Good for finding very small factors
Takes p/ log p trial divisions to find a prime factor p.
Pollard rho
Good for finding slightly larger prime factors
Intuition
I Try to take a random walk among elements modN.
I If p divides n, there will be a cycle of length p.
Details
I Expect a collision after searching about√
p random elements.
I To avoid having to store and search every element, use Floyd’s cycle-finding algorithm.
I (Store two pointers, and move one twice as fast as the other until they coincide.)
Pollard rho
Good for finding slightly larger prime factors
Intuition
I Try to take a random walk among elements modN.
I If p divides n, there will be a cycle of length p.
Details
I Expect a collision after searching about√
p random elements.
I To avoid having to store and search every element, use Floyd’s cycle-finding algorithm.
I (Store two pointers, and move one twice as fast as the other until they coincide.)
Why is it the rho algorithm?
Pollard rho impementation
def rho(n):
a = 98357389475943875; c=10 # some random values f = lambda x: (x^2+c)%n
a1 = f(a) ; a2 = f(a1) while gcd(n, a2-a1)==1:
a1 = f(a1); a2 = f(f(a2)) return gcd(n, a2-a1)
sage: N = 698599699288686665490308069057420138223871 sage: rho(N)
2053
Pollard’s p − 1 method
Good for finding special small factors
Intuition
I If ar ≡ 1 mod p then p | gcd(ar − 1, N).
I Don’t know p, pick very smooth number r , hoping for ord(a)p
to divide it. (“Smooth” means has only small prime factors.)
N=44426601460658291157725536008128017297890787 4637194279031281180366057
r=lcm(range(1,2^22)) # this takes a while ... s=Integer(pow(2,r,N))
sage: gcd(s-1,N)
1267650600228229401496703217601
Pollard’s p − 1 method
Good for finding special small factors
Intuition
I If ar ≡ 1 mod p then p | gcd(ar − 1, N).
I Don’t know p, pick very smooth number r , hoping for ord(a)p
to divide it. (“Smooth” means has only small prime factors.)
N=44426601460658291157725536008128017297890787 4637194279031281180366057
r=lcm(range(1,2^22)) # this takes a while ...
s=Integer(pow(2,r,N)) sage: gcd(s-1,N)
1267650600228229401496703217601
Pollard p − 1 method
I This method finds larger factors than the rho method (in the same time)
...but only works for special primes.
In the previous example,
p − 1 = 26· 32· 52· 17 · 227 · 491 · 991 · 36559 · 308129 · 4161791 has only small factors (aka. p − 1 is smooth).
I Many crypto standards require using only “strong primes”
a.k.a primes where p − 1 is not smooth.
I This recommendation is outdated. The elliptic curve method (next slide) works even for “strong” primes.
Lenstra’s Elliptic Curve Method
Good for finding medium-sized factors
Intuition
I Pollard’s p − 1 method works in the multiplicative group of integers modulo p.
I The elliptic curve method is exactly the p − 1 method, but over the group of points on an elliptic curve modulo p:
I Multiplication of group elements becomes addition of points on the curve.
I All arithmetic is still done modulo N.
Theorem (Hasse)
The order of an elliptic curve modulo p is in [p + 1 − 2√
p, p + 1 + 2√ p].
There are lots of smooth numbers in this interval.
If one elliptic curve doesn’t work, try more until you find a smooth order.
Lenstra’s Elliptic Curve Method
Good for finding medium-sized factors
Intuition
I Pollard’s p − 1 method works in the multiplicative group of integers modulo p.
I The elliptic curve method is exactly the p − 1 method, but over the group of points on an elliptic curve modulo p:
I Multiplication of group elements becomes addition of points on the curve.
I All arithmetic is still done modulo N.
Theorem (Hasse)
The order of an elliptic curve modulo p is in [p + 1 − 2√
p, p + 1 + 2√ p].
There are lots of smooth numbers in this interval.
If one elliptic curve doesn’t work, try more until you find a smooth order.
Elliptic Curves
def curve(d):
frac_n = type(d) class P(object):
def __init__(self,x,y):
self.x,self.y = frac_n(x),frac_n(y)
def __add__(a,b):
return P((a.x*b.y + b.x*a.y)/(1 + d*a.x*a.y*b.x*b.y), (a.y*b.y - a.x*b.x)/(1 - d*a.x*b.x*a.y*b.y))
def __mul__(self, m):
return double_and_add(self,m,P(0,1)) ...
Elliptic Curve Factorization
def ecm(n,y,t):
# Choose a curve and a point on the curve.
frac_n = Q(n)
P = curve(frac_n(1,3)) p = P(2,3)
q = p * lcm(xrange(1,y)) return gcd(q.x.t,n)
I Method runs very well on GPUs.
I Still an active research area.
ECM is very efficient at factoring random numbers, once small factors are removed.
Heuristic running time Lp(1/2) = O(ec
√ln p ln ln p).
Quadratic Sieve Intuition: Fermat factorization
Main insight: If we can find two squares a2 and b2 such that a2≡ b2 mod N
Then
a2− b2 = (a + b)(a − b) ≡ 0 mod N
and we might hope that one of a + b or a − b shares a nontrivial common factor with N.
First try:
1. Start at c = d√ Ne
2. Check c2− N, (c + 1)2− N, . . . until we find a square. This is Fermat factorization, which could take up to p steps.
Quadratic Sieve Intuition: Fermat factorization
Main insight: If we can find two squares a2 and b2 such that a2≡ b2 mod N
Then
a2− b2 = (a + b)(a − b) ≡ 0 mod N
and we might hope that one of a + b or a − b shares a nontrivial common factor with N.
First try:
1. Start at c = d√ Ne
2. Check c2− N, (c + 1)2− N, . . . until we find a square.
This is Fermat factorization, which could take up to p steps.
Quadratic Sieve
General-purpose factoring
Intuition
We might not find a square outright, but we can construct a square as a product of numbers we look through.
1. Sieving Try to factor each of c2− N, (c + 1)2− N, . . . 2. Only save a di = ci2− N if all of its prime factors are less than
some bound B. (If it is B-smooth.)
3. Store each di by its exponent vector di = 2e23e3. . . BeB. 4. IfQ
idi is a square, then its exponent vector contains only even entries.
5. Linear Algebra Once enough factorizations have been collected, can use linear algebra to find a linear dependence mod2.
6. Square roots Take square roots and hope for a nontrivial factorization. Math exercise: Square product has 50% chance of factoring pq.
Quadratic Sieve
General-purpose factoring
Intuition
We might not find a square outright, but we can construct a square as a product of numbers we look through.
1. Sieving Try to factor each of c2− N, (c + 1)2− N, . . . 2. Only save a di = ci2− N if all of its prime factors are less than
some bound B. (If it is B-smooth.)
3. Store each di by its exponent vector di = 2e23e3. . . BeB. 4. IfQ
idi is a square, then its exponent vector contains only even entries.
5. Linear Algebra Once enough factorizations have been collected, can use linear algebra to find a linear dependence mod2.
6. Square roots Take square roots and hope for a nontrivial factorization. Math exercise: Square product has 50% chance of factoring pq.
Quadratic Sieve
General-purpose factoring
Intuition
We might not find a square outright, but we can construct a square as a product of numbers we look through.
1. Sieving Try to factor each of c2− N, (c + 1)2− N, . . . 2. Only save a di = ci2− N if all of its prime factors are less than
some bound B. (If it is B-smooth.)
3. Store each di by its exponent vector di = 2e23e3. . . BeB. 4. IfQ
idi is a square, then its exponent vector contains only even entries.
5. Linear Algebra Once enough factorizations have been collected, can use linear algebra to find a linear dependence mod2.
6. Square roots Take square roots and hope for a nontrivial factorization. Math exercise: Square product has 50% chance of factoring pq.
Quadratic Sieve
General-purpose factoring
Intuition
We might not find a square outright, but we can construct a square as a product of numbers we look through.
1. Sieving Try to factor each of c2− N, (c + 1)2− N, . . . 2. Only save a di = ci2− N if all of its prime factors are less than
some bound B. (If it is B-smooth.)
3. Store each di by its exponent vector di = 2e23e3. . . BeB. 4. IfQ
idi is a square, then its exponent vector contains only even entries.
5. Linear Algebra Once enough factorizations have been collected, can use linear algebra to find a linear dependence mod2.
6. Square roots Take square roots and hope for a nontrivial factorization. Math exercise: Square product has 50% chance of factoring pq.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b). 532− 2759 = 50 = 2 · 52.
542− 2759 = 157. 552− 2759 = 266. 562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157. 552− 2759 = 266. 562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52.
542− 2759 = 157. 552− 2759 = 266. 562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157.
552− 2759 = 266. 562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157.
552− 2759 = 266.
562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157.
552− 2759 = 266.
562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157.
552− 2759 = 266.
562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72.
582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157.
552− 2759 = 266.
562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157.
552− 2759 = 266.
562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112.
QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
An example of the quadratic sieve
Let’s try to factor N = 2759.
Recall idea: If a2− N is a square b2 then N = (a − b)(a + b).
532− 2759 = 50 = 2 · 52. 542− 2759 = 157.
552− 2759 = 266.
562− 2759 = 377.
572− 2759 = 490 = 2 · 5 · 72. 582− 2759 = 605 = 5 · 112.
The product 50 · 490 · 605 is a square: 22· 54· 72· 112. QS computes gcd{2759, 53 · 57 · 58 −√
50 · 490 · 605} = 31.
Quadratic Sieve running time
I How do we choose B?
I How many numbers do we have to try to factor?
I Depends on (heuristic) probability that a randomly chosen number is B-smooth.
Running time: Ln(1/2) = e
√
ln n ln ln n.
Number field sieve
Best running time for general purpose factoring
Insight
I Replace relationship a2 = b2mod N with a homomorphism between ring of integers OK in a specially chosen number field and ZN.
Algorithm
1. Polynomial selection Find a good choice of number field K . 2. Relation finding Factor elements over OK and over Z.
3. Linear algebra Find a square in OK and a square in Z 4. Square roots Take square roots, map into Z, and hope we
find a factor.
Running time: Ln(1/3) = e(ln n ln ln n)1/3.
Discrete log: State of the art
Discrete log: Compute g`= t in some group.
(Recall Diffie-Hellman.)
I For decades, progress in discrete log closely followed progress in factoring.
I Best running time was L(1/3) using number field or function field sieve.
... until this year...
A new index calculus algorithm with complexity L(1/4 + o(1)) in very small characteristic. Antoine Joux, February 2013
A quasi-polynomial algorithm for discrete logarithm in finite fields of small characteristic. Razvan Barbulescu, Pierrick Gaudry, Antoine Joux, and Emmanuel Thom´e, June 2013
Discrete log: State of the art
Discrete log: Compute g`= t in some group.
(Recall Diffie-Hellman.)
I For decades, progress in discrete log closely followed progress in factoring.
I Best running time was L(1/3) using number field or function field sieve.
... until this year...
A new index calculus algorithm with complexity L(1/4 + o(1)) in very small characteristic. Antoine Joux, February 2013
A quasi-polynomial algorithm for discrete logarithm in finite fields of small characteristic. Razvan Barbulescu, Pierrick Gaudry, Antoine Joux, and Emmanuel Thom´e, June 2013
Index calculus for discrete log.
Goal: Solve g`≡ t mod p.
1. Relation finding: Collect gr ≡ p1r1. . . pkrk mod p for many choices of r . This gives us a congruence of discrete logarithms for the pi:
r ≡ r1logg p1+ · · · + rkloggpk mod (p − 1) 2. Linear algebra Once we have enough relations, use linear
algebra to solve for the loggpi.
3. Smooth relation Search through values of R until we find one that results in a smooth value for gRt:
gRt ≡ pτ11. . . pkτk mod p 4. Output discrete log:
loggt ≡ −R + τ1logg p1+ · · · + τkloggpk mod (p − 1)
Recent discrete log progress
The new discrete log algorithms have resulted in very impressive new computational results.
We still need to verify whether heuristic assumptions line up in practice.
Open problem: Do these new results mean anything for discrete logs in large characteristic (working mod p?)
Open problem: Do these new results lead to new results in factoring?
Factoring in practice
1977 Rivest estimates factoring a 125-digit number would require 40 quadrillion years. (Letter to Martin Gardner)
1978 “An 80-digit n provides moderate security against an attack using current technology; using 200 digits provides a margin of safety against future
developments.” – A Method for Obtaining Digital Signatures and Public-Key Cryptosystems
1994 Atkins, Graff, Lenstra, Leyland factor RSA-129 challenge from Martin Gardner’s 1977 column. “We conclude that commonly-used 512-bit RSA moduli are vulnerable to any organiation prepared to spend a few million dollars and wait a few months.”
Factoring in practice
1999 te Riele, Cavallar, Dodson, Lenstra, Lioen, Montgomery, Murphy factor (512-bit) RSA-155 challenge in 4 months. Spent 35.7 CPU years.
2009 Benjamin Moody factors 512-bit RSA signing key for TI-83+ calculator in 73 days on a 1.9 GHz dual-core processor.
2012 Zachary Harris factors Google’s 512-bit email domain authentication key in 72 hours for $75 using Amazon ec2.
An amateur can use off the shelf software to factor
a 512-bit RSA modulus in about a day for a few
hundred dollars.
Factoring in the cloud
Amazon Elastic Compute Cloud http://aws.amazon.com/ec2/
+
Cluster computing toolkit for ec2 http://star.mit.edu/cluster/
+
Number field sieve implementation http://cado-nfs.gforge.inria.fr/
My first attempt: Running time details
30 hours to factor a 512-bit modulus,$234 in computation time.
I Polynomial selection: 7 minutes (parallel)
I Sieving: 14 hours (parallel)
I Generating matrix: 1 hour 30 minutes (single core)
I Linear algebra: 14 hours (partially parallel)
I Square root: 23 minutes (single core)
21 ”8x large compute optimized cluster compute instances”
= 20 spot instances ($0.27/hour)
+ 1 on demand instance control node ($2.40/hour)
The dirtier details of factoring
+14 hours sieving first attempt
+ 6 hours dealing with running out of disk space and StarCluster crashing when I tried to increase size of volume, gave up and restarted
+ 30 minutes figuring out why CADO-NFS crashed parallelizing linear algebra across 21 nodes until I figured out it wouldn’t crash if run with 16 nodes
$550 total Amazon bill over four days including time to figure out installation, configuration, test smaller experiments and false starts
Suggestions for future experiments and optimizations
Likely improvements
I Experiment with using more (cheaper) instances for sieving (used about 75% CPU on each instance)
I Experiment with effect of oversieving more to make linear algebra easier
I Optimize parallelization for linear algebra step. Linear algebra is only partially parallelized, and the parts that were
parallelized only used 15% of available CPU on each instance.
How hard is it to factor longer keys?
key size ECU years estimated cost
512 0.46 $137
768 2330 $422,000
896 1.5 · 105 $38 million 1024 2.8 · 106 $560 million 2048 2.8 · 1015 $ 1017
Using the cloud to determine key strengthsT. Kleinjung, A.K.
Lenstra, D. Page, and N.P. Smart. 2012