Comparing the Difficulty of Factorization and Discrete Logarithm: a 240-digit Experiment

(1)

Comparing the Difficulty of Factorization and

Discrete Logarithm: a 240-digit Experiment

Solving RSA-240, DLP-240, RSA-250

Fabrice Boudot, Pierrick Gaudry, Aurore Guillevic,

Nadia Heninger, Emmanuel Thom´e, Paul Zimmermann

FB: Ministère de l’Éducation Nationale, Université de Limoges PG+AG+ET+PZ: Université de Lorraine, CNRS, Inria, Nancy

NH: University of California, San Diego

(2)

Plan

Introduction

The Number Field Sieve

Collecting relations

Linear algebra

Software and computer resources

(3)

How does one choose

key sizes

?

When setting up crypto, a major decision is thekey size.

Efficiency: want shorter keys. Security: want longer keys.

Acompromiseis needed. An end user might. . . trust the manufacturer to do The Right Thing.

check that it abides by recommendations by regulatory bodies (NIST, ANSSI, BSI, . . . ).

Tricky questions for public-key crypto

How does oneassess the hardnessof cryptanalysis, for key sizes that are

(fortunately) out of its reach?

(4)

We need hard facts

Predictions can only be based on state-of-the-art software implementation performance.

We need actual software that is fit for large sizes, together with convincing computational results.

Explore algorithmic ideas that pay off only for large sizes. Explore scalability, try to address stumbling blocks.

Harness large computing power, show that this is more than just

theory.

Make our work reproducible.

(5)

IF versus FF-DLP

We look attwoimportant problems:

IF: Integer Factorization;

FF-DLP: Finite Field Discrete Logarithm Problem.

The appreciation of theirrelative difficulty is hard to do because IF and

FF-DLP records are usually done out of sync.

(6)

Plan

Introduction

Linear algebra

(7)

Summary of NFS

The NFS algorithm (1990) proceeds through many steps. Example workflow in the IF case.

N

polynomial

selection sieving filtering

linear algebra

square root

p,q

Computational requirements are diverse.

Sieving (relation collection) is the most expensive.

It can be massively distributed.

(Sparse) Linear algebra comes second.

(8)

What kind of

relations

does NFS collect?

Polynomial selectionfindsf with aknown root m modN.

LetQ(α) be the number field defined byf.

Bird’s eye view of strategy for factoring Search for pairs of integers(a,b) such that

a−bm

(an integer) and

a−bα

(an ideal inQ(α))

are both smooth: they factor into small things. Pairs yield relations.

Combine relationsso that all multiplicities are even. We (almost) have squares on both sides.

With further (easy) work, we find manyequalities of squares:

u2 ≡v2 modN leads to factors of N with probability at least 1₂.

(9)

NFS also works for FF-DLP

NFS works similarly for the discrete logarithm.

N becomesp. Note thatZ/pZ is a field.

Both sides are number fields. Not an issue.

FF-DLP version of NFS is no longer a story of finding squares.

We no longer seek even valuations and linear algebra overZ/2Z, but

linear algebra overZ/`Z, with ` a (large) prime factor ofp−1.

It’s harder.

(10)

Plan

Introduction

Linear algebra

(11)

Collecting relations

Relation collection is the most expensive step of NFS.

Description of relation collection

1. How do we divide the work?

2. How do we find smooth a−bm anda−bα?

3. How do we choose parameters so that the cost of linear algebra

(12)

1. (many) needles in a (huge) haystack

Searching a space of size (say) 265 takes long.

Trivial strategy (loop overa, loop over b) has unstable yieldand

does not work well.

Better: constrain a factorq in one of the factorizations.

Independent tasks perq. Yield is stable.

The prescribed factor is one thing less to find! (old folklore; records have been doing this for decades.) (⇒special-q sieving, lattice sieving, sieving by vectors.)

(13)

2. Finding smooth (

a

,

b

)

We have fixed aq.

We explore many (a,b) such that q appears somewhere.

We wanta−bm anda−bα to besmooth.

Strategy depends on potential prime factorsp.

A prime should appear either often, or very rarely.

below some bound, for p <B, strive to find allpairs (a,b) such that

p appears in the factorization.

We typically use a process called sieving.

“large primes” (LPs) such that B ≤p <L: allowed if we happen to find them.

(14)

The relations that we like to see

52_·₁₁_·₂₃_·₂₈₇₀₉₃_·₈₇₀₉₅₃_·_20179693_·_28306698811_·_47988583469 ₂3_·₅_·₇_·₁₃_·₃₁_·₆₁_·₁₄₄₀₇_·_26563253_·_86800081_·_269845309_·802234039_·_1041872869_·_5552238917_·_12144939971_·_15856830239 3·1609·77699·235586599·347727169·369575231·9087872491 23_·₃_·₅_·₁₃_·₁₉_·₂₃_·₃₁_·₅₉_·₂₃₉_·₃₉₈₉_·₇₉₅₁_·_2829403_·_31455623_·_225623753_·811073867_·_1304127157_·_78955382651_·_129320018741 5·1381·877027·15060047·19042511·11542780393·13192388543 24_·₅_·₁₃_·₃₁_·₅₉_·₈₂₃_·₂₈₀₁_·₂₆₅₃₉_·_2944817_·_3066253_·_87271397_·_108272617_·_386616343_{·815320151·}_1361785079_·_12322934353 23_·₅2_·₁₇₃_·₉₇₁_·_613909489_·_929507779_·_1319454803_·_2101983503 ₂7_·₃2_·₅_·₂₉_·₁₀₂₁_·₄₂₅₈₉_·₁₉₀₅₀₇_·₄₇₃₂₈₇_·_31555663_·_654820381_{·802234039·}_19147596953_·_23912934131_·_52023180217 22_·₁₅₁₉₃_·₂₃₂₈₉₁_·_19514983_·_139295419_·_540260173_·_606335449 ₂2_·₃4_·₁₃_·₁₉_·₇₄₈₉₇_·_1377667_·_55828453_·_282012013_·802234039_·_3350122463_·_35787642311_·_37023373909_·_128377293101 22_·₅4_·₄₃₉_·₁₄₈₃_·₁₃₁₂₁_·₂₁₃₈₃_·₆₇₇₅₁_·_452059523_·_33099515051 ₂2_·₃3_·₁₁_·₁₃_·₁₉_·₅₀₂₃_·_3683209_·_98660459_{·802234039·}_1506372871_·_4564625921_·_27735876911_·_32612130959_·_45729461779

small primes: abundant→ dense column in the matrix

large primes: rare →sparse colum, limit to 2 or 3 on each side.

Before linear algebra, thefilteringstep tries to do as manycheap

combinationsas it can, so as to get a smaller matrix.

(15)

The relations that we like to see

52_·₁₁_·₂₃_·₂₈₇₀₉₃_·₈₇₀₉₅₃_·_20179693_·_28306698811_·_47988583469 ₂3_·₅_·₇_·₁₃_·₃₁_·₆₁_·₁₄₄₀₇_·_26563253_·_86800081_·_269845309_·802234039_·_1041872869_·_5552238917_·_12144939971_·_15856830239 3·1609·77699·235586599·347727169·369575231·9087872491 23_·₃_·₅_·₁₃_·₁₉_·₂₃_·₃₁_·₅₉_·₂₃₉_·₃₉₈₉_·₇₉₅₁_·_2829403_·_31455623_·_225623753_·811073867_·_1304127157_·_78955382651_·_129320018741 5·1381·877027·15060047·19042511·11542780393·13192388543 24_·₅_·₁₃_·₃₁_·₅₉_·₈₂₃_·₂₈₀₁_·₂₆₅₃₉_·_2944817_·_3066253_·_87271397_·_108272617_·_386616343_{·815320151·}_1361785079_·_12322934353 23_·₅2_·₁₇₃_·₉₇₁_·_613909489_·_929507779_·_1319454803_·_2101983503 ₂7_·₃2_·₅_·₂₉_·₁₀₂₁_·₄₂₅₈₉_·₁₉₀₅₀₇_·₄₇₃₂₈₇_·_31555663_·_654820381_{·802234039·}_19147596953_·_23912934131_·_52023180217 22_·₁₅₁₉₃_·₂₃₂₈₉₁_·_19514983_·_139295419_·_540260173_·_606335449 ₂2_·₃4_·₁₃_·₁₉_·₇₄₈₉₇_·_1377667_·_55828453_·_282012013_·802234039_·_3350122463_·_35787642311_·_37023373909_·_128377293101 22_·₅4_·₄₃₉_·₁₄₈₃_·₁₃₁₂₁_·₂₁₃₈₃_·₆₇₇₅₁_·_452059523_·_33099515051 ₂2_·₃3_·₁₁_·₁₃_·₁₉_·₅₀₂₃_·_3683209_·_98660459_{·802234039·}_1506372871_·_4564625921_·_27735876911_·_32612130959_·_45729461779

small primes: abundant→ dense column in the matrix

large primes: rare →sparse colum, limit to 2 or 3 on each side.

Before linear algebra, thefilteringstep tries to do as manycheap

(16)

3. Paying attention to the combination cost

Relations with 2 LPs or less are a blessing.

They easily participate in cheap combinations.

If we have only 2-LP relations, filtering will get rid of most of them. We are left with a number of primes to combine that is roughly the

number of primes below B.

Caveat: two sides to deal with.

We mustpay attention to q as well! How does it compare toB?

(17)

Strategy for RSA-240

q 229.6 0.8e9 2 31 2.1e9 B 232.8 7.4e9 2 36 69e9 L q<B: allow 2 LPs on side 0, 3 LPs on side 1.

B≤q<L: allow 2 LPs on each side. (q counts as an extra LP on side 1.)

This strategy makes it easy to get rid of mostp ≥B on side 0 before we

enter linear algebra proper.

We still have many on side 1, but that is not too bad because linear algebra in the factoring context is reasonable.

(18)

Strategy for DLP-240

For DLP-240, we usedcompositeq, to avoid the disadvantage of having

q in the LP range. q 213 8e3 2 26.5 1e8 2 29 0.5e9 B 235 34e9 L 237.1 150e9 2 38.1 300e9 qi,qj (prime factors of q) q =qiqj

Allow 2 LPs on each side. (Factors ofq are not LPs.) This strategy was efficient in reducing the combination work to essentially

primesp <B only.

(19)

Alternative to sieving

Another important ingredient that we used. Fun way to find the

B-smoothpart of many integers (say, manya−bm’s) (Bernstein, 2000):

Multiply all primes together! P =Q

p<Bp.

Multiply all integers together! M =Q

(a,b)∈S(a−bm). (keeping track of the tree of subproducts)

ComputeP modM, then P mod the two halves and so on, down

to {P mod (a−bm)}₍_a_,_b₎_∈S.

This has pros and cons.

Asymptotically fast with FFT-like algorithms.

Finds all primes <B (so does sieving).

Requires some memory, and not trivial to parallelize. But allows to save memory in other steps.

(20)

Alternative to sieving

Another important ingredient that we used. Fun way to find the

B-smoothpart of many integers (say, manya−bm’s) (Bernstein, 2000):

Multiply all primes together! P =Q

p<Bp.

Multiply all integers together! M =Q

(a,b)∈S(a−bm). (keeping track of the tree of subproducts)

ComputeP modM, then P mod the two halves and so on, down

to {P mod (a−bm)}₍_a_,_b₎_∈S.

This has pros and cons.

Asymptotically fast with FFT-like algorithms.

Finds all primes <B (so does sieving).

Requires some memory, and not trivial to parallelize. But allows to save memory in other steps.

(21)

Plan

Introduction

Linear algebra

(22)

Rules of the game

Everything reduces to linear algebra

Integer Factorization: combining relations so as to obtain even

valuations is a linear algebra problem, overZ/2Z.

FF-DLP: not the same goal, but still a linear algebra problem, this

time overZ/`Zwhere` hasseveral hundred bits.

Matrix dimensions RSA-240: nrows=282M; 200 non-zero/row.

DLP-240: nrows=37M; 250 non-zero/row.

As the matrix issparse, we use an iterative algorithm.

Key operation is SpMV: sparse matrix times vector.

(23)

Better scaling

We use theblock Wiedemann algorithm(1994).

iterative, withn (shorter) independent sequences.

(Almost) linear scaling in the iterative part, with no communication.

work can be reconciled by computing a generator(≈χ).

The more sequences we use, the more expensive it gets.

Cost metrics withn sequences (assuming n <log nrows)

Time Memory main operations

Per sequence O(nrows2/n)

n-way distributed

O(nrows) movq,addq

(24)

Plan

Introduction

Linear algebra

(25)

The CADO-NFS software

We used theCADO-NFSsoftware.

Important software development effort since 2007. 250k lines of C/C++ code.

Open source (LGPL), open development model (gitlab).

Regarding relation collection only:

An important part of CADO-NFS (60k lines) Significant improvements since 2016.

improved parallelism: strive to get rid of scheduling bubbles; versatility: large freedom in parameter selection;

(26)

Work on linear algebra

Linear algebra is a big part ofCADO-NFS(60k lines C/C++).

SpMV uses MPI+threads, some low-level assembly. 2016: big performance improvements.

Generator step uses asymptotically fast FFT-based algorithms, MPI, threads, . . .

2019–: better parallelization, more flexible (still WIP).

We can achieve fairly good scaling

Useseveral sequences. Do SpMV in parallel as well.

RAM in generator step: be careful! 0 20 40 60 80 100 120 140 160 0 50000 100000 150000 200000 250000 26 ₂7 ₂8 ₂9 ₂10 ₂11 ₂12 ₂13 ₂14 ₂15 ₂16 cores 22 23 24 25 26 27 28 29 210 211 212

time to solution (days)

n=48 n=32 n=24 n=16 n=8 n=4 n=2 n=1

(27)

Computer resources used

We used several computer clusters. Mostly recent hardware.

Grid5000clusters (FR) in best-effort mode.

Hardware from University of Pennsylvania. (later moved to UCSD) (the move took time). Campus computer cluster in Nancy, France.

32M core-hours compute allocation on EU PRACE infrastructure. Note: each platform comes with its usage specificities (hardware, software, policies, faults, . . . ).

(28)

Plan

Introduction

Linear algebra

(29)

Approximative timeline and core-hours

2018/08 - 2019/03 DLP-240relation collection. 21M c·h

4k cores working in parallel.

2019/05 - 2019/08 DLP-240linear algebra (sequences) 5M c·h

2019/04 - 2019/06 RSA-240relation collection. 7M c·h

4.3k cores working in parallel.

2019/10 - 2020/02 RSA-250relation collection. 21M c·h

12k cores working in parallel.

2019/07 - 2019/08 RSA-240linear algebra (sequences) 0.6M c·h

2019/11 RSA-240linear algebra (wrap up) 0.1M c·h

DLP-240linear algebra (wrap up) 0.7M c·h

2020/02 RSA-250linear algebra 2M c·h

(30)

Total cost

Aggregated cost in physical core-years (c·y) and core-hours (c·h)

(platform details in paper).

sieving matrix

RSA-240 800 c·y (7Mc·h) 83 c·y (0.7Mc·h)

DLP-240 2,400 c·y (21Mc·h) 625 c·y (6Mc·h)

RSA-250 2,450 c·y (21Mc·h) 250 c·y (2Mc·h)

Extensive information on the computational data, the parameters, and how to reproduce (parts of) our computation can be found at

https://gitlab.inria.fr/cado-nfs/records/.

(31)

Conclusions

More than just records, we developed efficientparameterization

strategies for further computations.

We developed anextensive simulation framework to guide the

parameter choices. Not perfect.

We show that our implementationscales well and can tackle larger

problems. No technology barrier at this point. Comparisons:

Comparison with previous record (DLP-768, 232 digits, 2016): Onidentical hardware, our DLP-240 computation would have taken

less time than the 232-digits computation.

FF-DLP is not muchharder than integer factoring.

For future projects, we intend to keep the focus on our capacity to anticipate the computational cost, and to harness large computing power.