Quantitative security of block ciphers: designs and cryptanalysis tools


Academic year: 2021


Quantitative Security of Block Ciphers: Designs and Cryptanalysis Tools

Thesis No. 4208 (2008)

Presented on 14 November 2008 at the School of Computer and Communication Sciences (Faculté Informatique et Communications), Laboratoire de Sécurité et de Cryptographie, École Polytechnique Fédérale de Lausanne, for the degree of Docteur ès Sciences

by Thomas BAIGNÈRES, communication systems engineer (EPF), of French and Swiss nationality, originally from Vira (Gambarogno) (TI)

Accepted on the recommendation of the jury:
Prof. A. Lenstra, jury president
Prof. S. Vaudenay, thesis director
Dr H. Gilbert, examiner
Prof. S. Morgenthaler, examiner
Prof. J. Stern, examiner

Contents

I An Introduction to Modern Cryptology and an Approach to the Design and Cryptanalysis of Block Ciphers

1 Shannon’s Theory of Secrecy
1.1 The Encryption Model: Preserving Confidentiality
1.2 Perfect Secrecy and the Vernam Cipher
1.3 Going Beyond Perfect Secrecy
1.4 Thesis Outline

2 Computationally Bounded Adversaries
2.1 Black Box Attacks: Determining the Secret Key Length
2.2 New Directions in Cryptography: reducing Confidentiality to Authenticity

3 Block Ciphers Design: a Top-Down Approach
3.1 Iterated Block Ciphers and Key Schedules
3.2 Round Functions Based on Feistel Schemes
3.3 Round Functions Based on Lai-Massey Schemes
3.4 Round Functions Based on Substitution-Permutation Networks
3.5 Providing Diffusion: on the Need for Multipermutations
3.6 Providing Confusion: Mixing key bits
3.7 The Advanced Encryption Standard

4 The Luby-Rackoff Model: Statistical Attacks against Block Ciphers
4.1 The Perfect Cipher and Security Models
4.2 From Distinguishing to Key Recovery
4.3 Linear Cryptanalysis

5 Notations and Elementary Results
5.1 Random Variables, Probabilities, Strings, etc.
5.2 Vector Norms and Fundamental Inequalities

II On the (In)Security of Block Ciphers: Tools for Security Analysis

6 Distinguishers Between Two Sources
6.1 A Typical Introduction to Simple Hypothesis Testing
6.2 An Alternate View through the Method of Types
6.3 The Best Distinguisher: an Optimal Solution
6.4 The Best Distinguisher: Data Complexity Analysis
6.5 The Best Distinguisher: Examples and Pathological Distributions
6.6 The Best Distinguisher: Case where the Distributions are Close to Each Other
6.7 The Best Distinguisher: Case where one of the Distributions is Uniform
6.8 The Best Distinguisher: Case where one Hypothesis is Composite
6.9 A General Heuristic Method to Compute the Advantage of an Arbitrary Distinguisher
6.10 Case where One of the Distributions is Unknown: the Squared Distinguishers Family

7 Projection-Based Distinguishers Between two Sources
7.1 On the Need for New Distinguishers
7.2 Best Distinguisher made Practical Using Compression
7.3 Linear Distinguishers for Binary Sources
7.4 Links between Best, Projection-Based, and Linear Distinguishers for Binary Sources
7.5 Extending the Notion of Linear Probability to Arbitrary Sets
7.6 Linear Distinguishers for Sources over Arbitrary Sets
7.7 A Fundamental Link Between Projection-Based and Linear Distinguishers
7.8 Links with Differential Cryptanalysis

8 Projection-Based Distinguishers Between two Oracles
8.1 From Random Sources to Random Oracles
8.2 Cryptanalysis Complexity by means of Transition and Bias Matrices
8.3 Piling-up Transition Matrices
8.4 Generalized Linear Cryptanalysis of Block Ciphers
8.5 The Block Cipher DEAN: a Toy Example for our Generalization of Linear Cryptanalysis
8.6 A $\mathbb{Z}_{100}^{16}$ Generalized Linear Cryptanalysis of TOY100

9 A Generalized Linear Cryptanalysis of SAFER K/SK
9.1 The SAFER Family
9.2 Linear Cryptanalysis of SAFER: from $\mathbb{Z}_2^8$ to $\mathbb{Z}_{2^8}$
9.3 Attacks on Reduced-Round Versions of SAFER
9.4 Implementation of the Attack on 2 Rounds

III Block Cipher Designs and Security Proofs

10 Provable Security and the Decorrelation Theory
10.1 The Luby-Rackoff Model
10.2 Computing the Advantage by means of Distribution Matrices
10.3 From Linear Cryptanalysis and Differential Cryptanalysis to other Iterated Attacks
10.4 Decorrelation of Feistel Ciphers
10.5 Decorrelation Modules: Avoiding Algebraic Constructions

11 Dial C for Cipher
11.1 A Description of the Block Cipher C
11.2 Exact Security against 2-limited Adversaries
11.3 Consequences for Iterated Attacks of Order 1, Linear and Differential Cryptanalysis
11.4 Exact Security against Linear and Differential Cryptanalysis
11.5 Towards the Perfect Cipher
11.6 Provable Security against Impossible Differentials
11.7 Taking the Key-Schedule into Account
11.8 Unproved Security against other Attacks
11.9 A Fast Variant of C without Security Compromise
11.10 Implementation and Performances
11.11 Summary

12 KFC: the Krazy Feistel Cipher
12.1 From the SPN of C to the Feistel Network of KFC
12.2 A Good Round Function for the Feistel Scheme
12.3 Exact Security of $F_{\mathrm{KFC}}$ against 2-limited Adversaries
12.4 Bounding the Security of $F_{\mathrm{KFC}}$ against Adversaries of Higher Order
12.5 KFC in Practice
12.6 Further Improvements

13 Conclusion and Future Work

IV Appendixes

A A Proof of Sanov’s Theorem

B Proof of Lemma 6.6

C Proofs of the Lemmas Used in Example 7.3

D The Substitution Box of DEAN27

E Complementary Information on SAFER
5.1 List of Some of the Possible Successions of Patterns on the Linear Layer
5.2 Sequences of Three Weights
5.3 Complexities of the Attacks against 3, 4, and 5 Rounds

Abstract

Block ciphers probably figure in the list of the most important cryptographic primitives. Although they are used for many different purposes, their essential goal is to ensure confidentiality. This thesis is concerned with their quantitative security, that is, with measurable attributes that reflect their ability to guarantee this confidentiality.

The first part of this thesis deals with well-known results. Starting with Shannon’s Theory of Secrecy, we move to practical implications for block ciphers, recall the main schemes on which today’s block ciphers are based, and introduce the Luby-Rackoff security model. We describe distinguishing attacks and key-recovery attacks against block ciphers and show how to turn the former into the latter. As an illustration, we recall linear cryptanalysis, which is a classical example of statistical cryptanalysis.

In the second part, we consider the (in)security of block ciphers against statistical cryptanalytic attacks and develop some tools to perform optimal attacks and quantify their efficiency. We start with a simple setting in which the adversary has to distinguish between two sources of randomness and show how an optimal strategy can be derived in certain cases. We proceed with the practical situation where the cardinality of the sample space is too large for the optimal strategy to be implemented and show how this naturally leads to the concept of projection-based distinguishers, which reduce the sample space by compressing the samples. Within this setting, we reconsider the particular case of linear distinguishers and generalize them to sets of arbitrary cardinality. We show how these distinguishers between random sources can be turned into distinguishers between random oracles (or block ciphers) and how, in this setting, one can generalize linear cryptanalysis to Abelian groups. As a proof of concept, we show how to break the block cipher TOY100, introduce the block cipher DEAN, which encrypts blocks of decimal digits, and apply the theory to the SAFER block cipher family.

In the last part of this thesis, we introduce two new constructions. We start by recalling some essential notions about provable security for block ciphers and about Serge Vaudenay’s Decorrelation Theory, and introduce new simple modules for which we prove essential properties that we will later use in our designs. We then present the block cipher C and prove that it is immune against a wide range of cryptanalytic attacks. In particular, we compute the exact advantage of the best distinguisher limited to two plaintext/ciphertext samples between C and the perfect cipher and use it to compute the exact value of the maximum expected linear probability (resp. differential probability) of C, which is known to be inversely proportional to the number of samples required by the best possible linear (resp. differential) attack. We then introduce KFC, a block cipher which builds upon the same foundations as C but for which we can prove results for higher order adversaries. We conclude both discussions about C and KFC with implementation considerations.

Keywords: Cryptography, block cipher, statistical cryptanalysis, linear cryptanalysis, hypothesis testing, SAFER, Decorrelation Theory

Résumé

Secret-key encryption algorithms are most certainly among the most important cryptographic primitives. Although they are used for very diverse purposes, their main function is to ensure the confidentiality of data. This thesis is concerned with their quantitative security, that is, with the measurable attributes that reflect their ability to guarantee this confidentiality.

The first part of this thesis deals with a number of well-known results. Starting from Shannon’s theory of secrecy, we consider the practical implications for secret-key encryption algorithms, recall the elementary schemes on which they are built, and introduce the Luby-Rackoff model. We describe attacks that aim at distinguishing one random permutation from another, then attacks whose objective is to recover the secret key, and finally show how the former can lead to the latter. As an example, we recall the concepts of linear cryptanalysis, which is a classical example of statistical cryptanalysis.

In the second part, we consider the (in)security of secret-key encryption algorithms against statistical cryptanalytic attacks and develop some tools to carry out certain attacks and quantify their efficiency. We consider a very simple initial setting in which an adversary must distinguish one random source from another and show that, in certain cases, an optimal strategy can be found. We then treat the practical case in which the cardinality of the sample space is too large for the optimal strategy to be used as is, which naturally leads to the definition of projection-based distinguishers, which reduce the space by compressing each sample. From this perspective, we reconsider the case of linear distinguishers and generalize them to sets of arbitrary cardinality. We show how these distinguishers between random sources can be turned into distinguishers between random oracles and how, in this way, linear cryptanalysis can be generalized to Abelian groups. As a proof of concept, we show how to break the encryption algorithm TOY100, introduce the algorithm DEAN, which encrypts blocks of decimal digits, and apply the theory to the SAFER family of algorithms.

In the last part of this thesis, we propose two new constructions. We begin by recalling some essential notions about the provable security of secret-key encryption algorithms and about the Decorrelation Theory developed by Serge Vaudenay. We introduce new modules for which a number of security results can be proved and which will be at the heart of the two constructions that follow. We then present the encryption algorithm C and prove its security against a number of attacks. In particular, we compute the exact advantage of the best distinguisher limited to two plaintext/ciphertext pairs between C and the perfect cipher and use this result to compute the exact value of the maximum expected linear probability (as well as that of the expected differential probability) of C, which is known to be inversely proportional to the number of samples needed to mount a successful attack. We then introduce KFC, an algorithm which rests on the same foundations as C but for which we manage to prove results concerning adversaries of higher orders. In both cases, we conclude the discussion with experimental considerations.

Keywords: Cryptography, secret-key encryption algorithm, statistical cryptanalysis, linear cryptanalysis, hypothesis testing, SAFER, Decorrelation Theory

Acknowledgements

I would first like to thank my thesis director, Serge Vaudenay, without whom this manuscript would never have seen the light of day. A permanent source of inspiration, demanding and talented, sometimes hard but always fair, upright and considerate, he often went beyond my expectations. I hope that my work has lived up to his.

Thank you to all the members of the jury for giving me their time, their advice, and their encouragement for the future. In particular, thank you to Stephan Morgenthaler for taking the trouble to step somewhat outside his favorite field, thank you to Jacques Stern not only for accepting my invitation but also for his help regarding the generalization of linear cryptanalysis, thank you to Henri Gilbert for his help, his kindness, and his enlightened advice (well before the defense already!). Thank you to Arjen Lenstra, jury president, for his good humor and his outspokenness!

Thank you to the Fonds National Suisse (Swiss National Science Foundation) for having contributed to most of my work (grant 200021-107982/1).

LASEC would not be what it is today without those who preceded me: thank you to Pascal Junod for making me want to go further in research (and for what will probably remain my most unforgettable beer: Singapore, 30° in the small hours of the morning, sitting at a table in the very middle of a deserted street in the Indian quarter), thank you to Jean Monnerat for the endless discussions (over coffee in Lausanne, in a hole-in-the-wall eatery in Shanghai, in a high-tech bar in Saint Petersburg), thank you to Gildas Avoine for his (almost) always caustic humor (Pépin’s friends will recognize themselves). Thank you to Martine Corval for her invaluable help (and for always knowing how to find the right words when nothing is going right!). I do not forget the next generation: thank you to Sylvain Pasini for his unequaled kindness, thank you to Martin Vuagnoux for always knowing how to share his energy with others! The “lab” has bright days ahead. Last but not least, thank you to my “office mate” Matthieu Finiasz for putting up with my foul temper for two years! Those who know him will not contradict me: his kindness matches his talent. Together we did work that I believe to be tremendous, and carrying it out brought me immense happiness.

“La crypto, c’est rigolo !” one can read on a brochure of the Department of Mathematics and Computer Science of the École Normale Supérieure. Crypto also means many conferences, and trips to the other side of the world to attend them. Thank you to the community of cryptographers for the very good times I had there. Thank you in particular to Frédéric Muller, Claude Barral, Thomas Peyrin, Raphael Overbeck, Khaled Ouafi, Rafik Chaabouni, Antoine Joux, Willi Meier, David Naccache, Pascal Paillier, Phong Q. Nguyen, Julien Stern, Olivier Billet, Emmanuel Bresson, David Pointcheval, Jean-Philippe Aumasson, Kaisa Nyberg, and Raphael Phan.

Thank you to Chrissie and John Barlow for their friendship. Thank you in particular to John for his help, his advice, and his corrections!

I had the chance to put the world to rights several times with them (and it badly needs it): thank you to Robert Bargmann, Numa Schmeder, and Damien Tardieu for sharing five unforgettable years with me. With you, working became a pleasure again. Thank you to Thibaut Davain for more than 20 years of unfailing friendship.

Thank you to Jacqueline for making me (re)discover the Rafraire and those who make it what it is. Thank you to Danielle and René for welcoming me there as if I had always been part of it. A thousand thanks to Caroline and Jacques-André for so many unforgettable evenings. Thank you to Jacques and Yolène for their support and their good humor!

Thank you to Elena and Bernard who, almost fifteen years ago, welcomed me as if I were already part of the family. Thank you to Stefaan and Hélène for all the good times spent since. It is also to your support that I owe my success.

Thank you to my grandmother, who passed on so many things to me, including her love of mathematics. Thank you to my uncle for his help and his affection.

Thank you to Yvonne and Mathias for their constant support. Thank you to Mathias for his always sound advice and for being there when I need him (thanks for the dice!).

Thank you to my parents for their unconditional support and for always believing in me. I cannot express here the immense love and gratitude I have for them. My success is also yours.

Thank you to Valérie for accompanying me, for supporting me, for sharing, for believing in me more than I do myself, for more than fifteen years. Thank you for bringing me the balance I would otherwise lack. Thank you for enchanting my life.

In memory of my father.

So long as we live, he too shall live. For he is now a part of us, As we remember him.

Part I

An Introduction to Modern Cryptology and an Approach to the Design and Cryptanalysis of Block Ciphers

Chapter 1

Shannon’s Theory of Secrecy

The oldest concern of cryptography is probably to find the most efficient and elegant technique to transmit confidential information (through time or space) to a recipient, and to this recipient only. The first known reference to this problem dates back to quite ancient times, a fact that David Kahn illustrates from the very beginning of “The Codebreakers” [78] by entitling the second chapter “The first 3,000 years”. During this period of time, cryptography fascinated not only the most important world leaders (Julius Caesar’s cipher is one of the first encryption methods taught in almost every lecture on cryptography) but also the greatest artists and scientists. It is not surprising that several books relate its story [78,100,141,142] for a reason which is very clearly and concisely summarized in a (by now) famous leitmotiv propagated in the 90’s by a young cryptographer [149] of the “École Normale Supérieure”:

“La crypto c’est rigolo”.¹

Yet, the bases of cryptography as a scientific discipline were only formulated in 1946 by Claude E. Shannon in the confidential report (by now declassified) “A Mathematical Theory of Cryptography” [139]. Its mathematical analysis provides a formal statement of what defines a cryptographic system and what one should require from it. Shannon’s theory of secrecy is concerned with encryption methods which allow one to conceal information originally contained in a message (or plaintext) in a so-called ciphertext. Ideally, the ciphertext alone should not allow the recovery of information, so the fact that it is eavesdropped by some adversary cannot do any harm.²

¹ Crypto is fun.

² Shannon makes reference to the “enemy” since at that time, cryptography was essentially of military concern.

1.1 The Encryption Model: Preserving Confidentiality

Shannon defines a secrecy system (or a cryptographic system) as “a set of transformations of one space (the set of possible messages) into a second space (the set of possible cryptograms)” [139]. Each of the transformations is indexed by a key which shall only be shared by the sender and the recipient of the message. This situation is illustrated in Figure 1.1. Using the key $K$ and the encipherer $T$, the sender encrypts the message $M$ and obtains the cryptogram (or ciphertext)

$$C = T_K(M),$$

which is sent over an insecure channel to the recipient, who recovers the original message $M$ using the decipherer $T^{-1}$ as

$$M = T_K^{-1}(C).$$

[Figure 1.1: Symmetric Encryption. The message source emits the message $M$; the encipherer $T_K$, keyed with the key $K$ from the key source, outputs the cryptogram $C$, which the enemy cryptanalyst can observe; the decipherer $T_K^{-1}$, keyed with the same $K$, recovers the message $M$.]

According to this scenario, all the transformations defined by the system should be invertible in order to allow the recipient to recover only the original plaintext from the ciphertext. The channel on which the ciphertext is sent is assumed to be insecure in the sense that the enemy cryptanalyst (or adversary) can eavesdrop on any message on that channel.

The key $K$ is sampled by the key source in the finite space of all the possible keys allowed by the system. The key is usually considered as a random variable following some a priori distribution (which is known by the adversary). In most cases, this distribution is assumed to be uniform. Similarly, the message $M$ is sampled by the message source according to some a priori distribution which is generally non-uniform. Once the adversary has intercepted the ciphertext $C$, the new distributions of $M$ and $K$ are referred to as the a posteriori distributions, since the adversary can benefit from any information that can be extracted from $C$. Intuitively, the level of security achieved by the system depends on how far the a posteriori distributions are from the a priori distributions.
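To make the a priori and a posteriori distributions concrete, the following minimal Python sketch computes $\Pr[M = m \mid C = c]$ by Bayes' rule for a toy secrecy system. The system itself (messages and keys in $\mathbb{Z}_3$, encryption by modular addition), the chosen distributions, and the names `encipher` and `posterior` are illustrative assumptions and are not taken from the thesis.

```python
from collections import defaultdict
from fractions import Fraction

N = 3  # hypothetical toy setting: messages and keys are elements of Z_3


def encipher(key, message):
    """Toy transformation T_K(M) = M + K mod N (an assumption for illustration)."""
    return (message + key) % N


def posterior(prior_m, prior_k, c):
    """Return Pr[M = m | C = c] for every message m, using Bayes' rule."""
    joint = defaultdict(Fraction)  # joint[m] = Pr[M = m, C = c]
    for m, pm in prior_m.items():
        for k, pk in prior_k.items():
            if encipher(k, m) == c:
                joint[m] += pm * pk
    total = sum(joint.values())  # Pr[C = c]
    return {m: joint[m] / total for m in prior_m}


# Non-uniform a priori distribution of the message.
prior_m = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
# Uniform key distribution versus a biased one.
uniform_k = {k: Fraction(1, N) for k in range(N)}
biased_k = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)}

post_uniform = posterior(prior_m, uniform_k, c=0)
post_biased = posterior(prior_m, biased_k, c=0)
print(post_uniform)
print(post_biased)
```

Under these assumptions, the uniform key leaves the a posteriori distribution of $M$ equal to its a priori distribution, whereas the biased key shifts it, so the intercepted ciphertext then carries information about $M$.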

In this scenario, the secret key must be transmitted to both parties over a secure (i.e., confidential) channel, which is obviously more “expensive” to use than the insecure one. This clearly makes sense when the encryption method is such that the message space from which $M$ is chosen is larger than the key space. For example, modern encryption procedures (such as block ciphers or stream ciphers) allow the encryption of several gigabytes of data with only one 128-bit key. But this model can also be meaningful when the message and the key are equal in length (which is mandatory when one aims at unconditional security, as we will see). In that case, one can anticipate any potential difficulty in transmitting confidential information at a certain time $t$ by transmitting the key at a time $t_0 < t$ when such a transmission is easier.

Finally, this model assumes that the adversary knows the set from which the transformations $T_K$ and $T_K^{-1}$ are chosen. In other words, the adversary knows the specifications of the cryptosystem that is used. Besides being conservative (which is often desirable from the point of view of security), this assumption has been proved correct in several situations and in particular when a large period of time is left to the adversary to break the system. This assumption corresponds to one of the most famous of Kerckhoffs’ principles [83], according to which the security of a cryptosystem should not rely on the secrecy of the cryptosystem itself (which is not to say that one should necessarily make it public in practice).

1.2 Perfect Secrecy and the Vernam Cipher

Ideally, no information about the plaintext should leak from the ciphertext $C$. In other words, the a posteriori distribution of the message should be identical to its a priori distribution so that an adversary with unlimited computational power cannot recover $M$ (nor $K$) from $C$. We thus consider that the encryption system achieves perfect secrecy when

$$\Pr[M = m \mid C = c] = \Pr[M = m]$$

for any acceptable ciphertext $c$ and message $m$, which also reads as

$$H(M \mid C) = H(M),$$

where $H(\cdot)$ denotes Shannon’s entropy [139, 157].

The Vernam cipher [160] is a stream cipher developed by Gilbert Sandford Vernam in 1926 which achieves perfect secrecy when the a priori distribution of the key is uniform [139] and when its length (at least) corresponds to that of the plaintext. Assuming that the plaintext and the key are represented as bit strings and that they are of equal length, the Vernam cipher simply computes the ciphertext as

$$C = M \oplus K,$$

where $\oplus$ denotes the bitwise exclusive-or operation. For several reasons, the Vernam cipher is impractical: not only can a secret key not be used twice, but it also has to be uniformly distributed, which is hard to achieve in practice. Yet, the problem of the secret key length is not inherent to the Vernam cipher but to the nature of perfect secrecy, as the following theorem of Shannon [139] shows: if a cryptographic system achieves perfect secrecy, then $H(K) \geq H(M)$.
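The Vernam cipher can be sketched in a few lines of Python. The block below is only an illustration: the helper name `vernam`, the sample message, and the exhaustive check on 2-bit strings are assumptions introduced here. The check counts, for each message, how often each ciphertext appears when the key ranges over all 2-bit values, which is the uniform-key condition under which perfect secrecy holds.

```python
import secrets
from collections import Counter


def vernam(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of the same length; the same call also decrypts."""
    if len(key) != len(data):
        raise ValueError("key and data must have equal length")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # fresh uniform key, to be used once
ciphertext = vernam(message, key)
recovered = vernam(ciphertext, key)

# Exhaustive check on 2-bit strings: for each message m, tally the ciphertexts
# obtained when the key takes each of its 4 values exactly once.
counts = {m: Counter(m ^ k for k in range(4)) for m in range(4)}
print(recovered, counts)
```

In this toy check every message yields every ciphertext exactly once, so $\Pr[C = c \mid M = m]$ does not depend on $m$, and by Bayes' rule $\Pr[M = m \mid C = c] = \Pr[M = m]$.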

1.3 Going Beyond Perfect Secrecy

Obviously, perfect secrecy is too expensive in many practical situations since the quantity of data to be sent over the secure channel is necessarily (at least) equal to that of the data to be secured. Modern encryption methods are thus more concerned with practical security instead.

Essentially, a cryptographic system is considered to be practically secure when no computationally bounded adversary can recover meaningful information about $M$ or $K$ from the sole knowledge of the ciphertext $C$. Most of the currently widely used block ciphers (such as the Advanced Encryption Standard [41]) are assumed to be practically secure (although in almost all cases, no strong mathematical proof of this is provided). Moreover, even in the case where perfect secrecy is not required, both ends still need to share the same secret key, which shall thus be transmitted to both ends in a confidential way. This problem was solved with the invention of public key cryptography, as we will see in Chapter 2.

1.4 Thesis Outline

In the rest of Part I, we will recall several notions concerning encipherers, which we rather call symmetric encryption algorithms. In particular we explain in Chapter 2 how to determine the secret key length by computing the complexity of black box attacks (i.e., generic attacks that apply to any block cipher) and show how the problem of sharing this secret key is solved by means of public key cryptography. Almost all practical block cipher constructions follow either a Feistel scheme [50] (or a generalization of it), a Lai-Massey scheme [96], or a substitution-permutation network (SPN). We recall these three schemes in Chapter 3 following a top-down approach, detailing several of the smallest building blocks used within these schemes together with some of the essential properties they should have. We recall in Chapter 4 the Luby-Rackoff security model [102]. We introduce the notion of perfect cipher together with statistical attacks against block ciphers. In particular, we explain the difference between distinguishing attacks and key-recovery attacks (and see how to turn the former into the latter), and recall linear cryptanalysis [110], which is a classical example of statistical cryptanalysis. The notations used throughout the rest of this thesis are introduced in Chapter 5, as well as some elementary mathematical results.

In Part II we consider the (in)security of block ciphers against statistical cryptanalytic attacks and develop some tools to perform optimal attacks and quantify their efficiency. We do this step by step, starting in Chapter 6 with a simple setting in which the adversary has to distinguish between two sources of randomness in a set of reasonable cardinality. Through the method of types, we show how to derive the optimal distinguisher limited to $q$ samples and compute its advantage, which we prove to be linked to the Chernoff information between the two probability distributions. Our treatment is not only valid when both distributions are of full support³ but also when their respective supports differ. Then we consider the case where both distributions are “close” to each other, which is a situation of practical interest in cryptography. We then turn to a more complex problem (from the point of view of the adversary) where one of the two hypotheses is composite. Finally, we study the case where the adversary has to decide whether or not the samples follow some known distribution, and we derive her advantage in this case also. In Chapter 7 we consider the case where the cardinality of the samples’ set is too large to implement the optimal distinguisher. We introduce projection-based distinguishers, which typically compress the samples before using them to decide between one hypothesis or another. Within this setting, we reconsider the particular case of linear distinguishers and generalize them to sets of arbitrary cardinality. We show how these distinguishers between random sources can be turned into distinguishers between random oracles (or block ciphers) in Chapter 8 and how, in this setting, one can generalize linear cryptanalysis to Abelian groups. Using these theoretical tools, we show how to break TOY100 and introduce the block cipher DEAN, which encrypts blocks of decimal digits. We apply the theory to the SAFER block cipher family in Chapter 9. Most of the theoretical tools introduced in this part are published in [7, 10], except for the generalization of linear cryptanalysis which is published in [8], along with the attacks on SAFER.

³ The support of a finite distribution is the set of points on which the probability is non-zero. A distribution is of full support when its support corresponds to the whole sample space.

We introduce two new block cipher designs in Part III. We start by recalling some essential notions about provable security for block ciphers and about Serge Vaudenay’s Decorrelation Theory [155] in Chapter 10. Our contribution essentially relies on introducing new simple modules for which we prove essential properties that we will later use in our designs. In Chapter 11 we introduce C, a block cipher provably secure against a wide range of cryptanalytic attacks, including linear and differential cryptanalysis (taking into account the linear hull effect [125] and the differentials effect, which is unfortunately almost never done in so-called traditional block cipher security proofs). In particular, we compute the exact advantage of the best distinguisher limited to two plaintext/ciphertext samples between C and the perfect cipher and use it to compute the exact value of the maximum expected linear probability (resp. differential probability) of C, which is known to be inversely proportional to the number of samples required by the best possible linear (resp. differential) attack. We conclude the chapter with implementation considerations. Since we are unable to prove any security result on C concerning the best $q$-limited distinguisher for $q > 3$, we introduce the block cipher KFC in Chapter 12, for which we indeed manage to prove security results for higher order adversaries. The block cipher C is published in [6], based on previous security results we obtained in [9]. The development of KFC is published in [5].


Chapter 2

Computationally Bounded Adversaries

In this chapter we show how to address two questions raised by Shannon’s encryption model, namely

• what should be the typical key length (in bits, for example) of a secure block cipher, and

• how one can transmit the secret key to both parties.

A block cipher on a finite set is a family of permutations on that set, indexed by a parameter called the key. More formally, let $\mathcal{T}$ and $\mathcal{K}$ be two finite sets, respectively called the text space and the key space. A block cipher $\mathsf{C}$ on the text space $\mathcal{T}$ and key space $\mathcal{K}$ is a set of $|\mathcal{K}|$ permutations on $\mathcal{T}$, i.e.,

$$\mathsf{C} = \{\mathsf{C}_k : \mathcal{T} \to \mathcal{T} : k \in \mathcal{K}\}.$$

To obtain a secure block cipher, it seems natural to require at least that the cardinality of $\mathcal{K}$ be large enough, for a reason that we formalize here.

2.1 Black Box Attacks: Determining the Secret Key Length

Exhaustive Key Search

We first assume that the block cipher has no equivalent keys, i.e., that $\mathsf{C}_k \neq \mathsf{C}_{k'}$ when $k \neq k'$ (otherwise, it suffices to keep in $\mathcal{K}$ exactly one representative of each equivalence class). We consider the scenario where the adversary is given a plaintext/ciphertext pair $(P, C)$ such that $C = \mathsf{C}_{\tilde{k}}(P)$ for some secret key $\tilde{k} \in \mathcal{K}$. The objective of the adversary is to recover $\tilde{k}$. Probably the most basic strategy is to exhaust all possible keys $k$ and check whether $C = \mathsf{C}_k(P)$. If this is not the case, then $k$ is certainly not the key $\tilde{k}$. If the equality holds, then the algorithm outputs $k$ and stops. This is illustrated in Algorithm 2.1.

Input: A plaintext/ciphertext pair $(P, C) \in \mathcal{T}^2$ such that $C = \mathsf{C}_{\tilde{k}}(P)$ for some secret key $\tilde{k} \in \mathcal{K} = \{k_1, k_2, \dots, k_{|\mathcal{K}|}\}$
Output: A key $k$
1: Select $\sigma$ uniformly at random among all permutations of $\{1, 2, \dots, |\mathcal{K}|\}$
2: for $i = 1, 2, \dots, |\mathcal{K}|$ do
3:     if $C = \mathsf{C}_{k_{\sigma(i)}}(P)$ then return $k_{\sigma(i)}$
4: end

Algorithm 2.1: Exhaustive search for the secret key $\tilde{k} \in \mathcal{K}$.

Assuming that $\tilde{k} = k_{\sigma(j)}$ for some $j$ and the permutation $\sigma$ drawn on line 1, it is clear that the algorithm succeeds if

$$C \neq \mathsf{C}_{k_{\sigma(i)}}(P)$$

for all $i = 1, 2, \dots, j-1$. We denote by $p$ the probability of success. Since $\sigma$ is uniformly distributed, we have

$$p = \frac{1}{N}$$

where $N$ denotes the number of keys $k$ in $\mathcal{K}$ such that $C = \mathsf{C}_k(P)$. We can approximate $N$ by assuming that the $|\mathcal{K}|$ permutations defined by the block cipher are initially chosen (at the time of designing $\mathsf{C}$) at random and in a uniform way so that, denoting by $\mathsf{C}^\star : \mathcal{T} \to \mathcal{T}$ a uniformly distributed random permutation, we have

$$N = \max(1, |\mathcal{K}| \Pr[\mathsf{C}^\star(P) = C]) = \max(1, |\mathcal{K}|/|\mathcal{T}|). \tag{2.1}$$

Since in practice $|\mathcal{K}|$ and $|\mathcal{T}|$ are close to each other, a few pairs are sufficient to obtain an overwhelming probability of success.

Assuming that the algorithm succeeds using only one pair, the time complexity is clearly equal to the position of $\tilde{k}$ in the list $\{k_{\sigma(1)}, k_{\sigma(2)}, \dots, k_{\sigma(|\mathcal{K}|)}\}$. In the worst case, the complexity is $|\mathcal{K}|$ encryptions, while on average it is $(|\mathcal{K}|+1)/2$ since $\sigma$ is uniformly distributed. In both cases this does not depend on the distribution of $\tilde{k}$ (thanks to the random selection of $\sigma$). The memory complexity of Algorithm 2.1 is clearly negligible.
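The exhaustive search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_encrypt` is a hypothetical 12-bit toy cipher invented for this example (it is not secure and does not come from the text), and the search filters candidate keys on a few pairs rather than one, in line with the remark following (2.1).

```python
import random

def toy_encrypt(key: int, block: int) -> int:
    """Hypothetical 12-bit toy cipher (NOT secure): one permutation per key."""
    x = (block ^ key) & 0xFFF
    x = (x * 0x6A5 + key) & 0xFFF      # odd multiplier: invertible modulo 2**12
    return x ^ (x >> 5)                # right xorshift: invertible as well

def exhaustive_search(pairs, key_bits=12, rng=random):
    """Try all keys in a random order; return (key, number of keys tried)."""
    order = list(range(1 << key_bits))
    rng.shuffle(order)                 # plays the role of the random permutation sigma
    for trials, k in enumerate(order, start=1):
        if all(toy_encrypt(k, p) == c for p, c in pairs):
            return k, trials
    return None, len(order)

secret = 0xABC
pairs = [(p, toy_encrypt(secret, p)) for p in (0x123, 0x456, 0x789)]
found, trials = exhaustive_search(pairs)
print(f"key {found:#05x} consistent with all pairs after {trials} trials")
```

The number of trials is the position of the first consistent key in the shuffled list, so it is at most $|\mathcal{K}|$ and about $(|\mathcal{K}|+1)/2$ on average when only the secret key is consistent with the pairs.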

Codebook Attack

The exhaustive key search algorithm requires no memory but has a tremendous time complexity. One can instead imagine storing all possible $(\mathsf{C}_k(P), k)$ pairs in a huge table (for all possible $k$ and one chosen $P$, sorted according to the first entry), requesting the encryption of $P$ under the secret key $\tilde{k}$, and performing one table look-up in order to recover the key. The time complexity is now negligible (except for the table pre-computation time) and the memory requirement is in $O(|\mathcal{K}|)$.
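The codebook attack can be sketched as follows, again on a hypothetical 12-bit toy cipher that is an assumption of this example and not part of the text. A Python dictionary stands in for the sorted table, and each entry keeps a list of keys since several keys may map the chosen plaintext to the same ciphertext.

```python
def toy_encrypt(key: int, block: int) -> int:
    """Hypothetical 12-bit toy cipher (NOT secure), used only for illustration."""
    x = (block ^ key) & 0xFFF
    x = (x * 0x6A5 + key) & 0xFFF      # odd multiplier: invertible modulo 2**12
    return x ^ (x >> 5)

def build_codebook(chosen_plaintext: int, key_bits: int = 12) -> dict:
    """Pre-computation: O(|K|) encryptions and O(|K|) memory."""
    table = {}
    for k in range(1 << key_bits):
        table.setdefault(toy_encrypt(k, chosen_plaintext), []).append(k)
    return table

def codebook_attack(table: dict, ciphertext: int) -> list:
    """Online phase: a single look-up returning the candidate keys."""
    return table.get(ciphertext, [])

P = 0x000
table = build_codebook(P)
secret = 0x5A7
candidates = codebook_attack(table, toy_encrypt(secret, P))
print(len(candidates), "candidate key(s) after one look-up")
```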

Time-Memory Trade-offs

Martin Hellman showed in [66] how to obtain a trade-off between time and memory complexities (a concept that was further refined in [127]). Essentially, the method allows the reduction of both time and memory complexities to $|\mathcal{K}|^{2/3}$.
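To give an order of magnitude, the following worked instance (added for illustration, using the 56-bit and 128-bit key lengths mentioned in the conclusion of this section) evaluates the trade-off point $|\mathcal{K}|^{2/3}$:

$$|\mathcal{K}| = 2^{56} \;\Rightarrow\; |\mathcal{K}|^{2/3} = 2^{112/3} \approx 2^{37.3}, \qquad |\mathcal{K}| = 2^{128} \;\Rightarrow\; |\mathcal{K}|^{2/3} = 2^{256/3} \approx 2^{85.3}.$$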

Figure 2.1: Secret key exchange by means of an authenticated channel. The figure shows Shannon’s encryption model (message source, encipherer $T_K$, enemy cryptanalyst, decipherer $T_K^{-1}$) in which each party draws a random value ($X$, resp. $Y$), sends $g^X$ (resp. $g^Y$) to the other, and computes $K = (g^Y)^X = (g^X)^Y = g^{XY}$.

Conclusion

What the black box methods show is that $\mathcal{K}$ should be large enough for the time needed to encrypt $|\mathcal{K}|$ plaintexts to be prohibitive. As a consequence, most current block ciphers use 128-bit or 256-bit keys, whereas older ciphers used 64-bit (or even 56-bit) keys.

2.2 New Directions in Cryptography: reducing Confidentiality to Authenticity

In their seminal article “New Directions in Cryptography” [47], Diffie and Hellman explain how to build a confidential channel from an authentic channel. The way their construction integrates in Shannon’s model of secrecy is illustrated in Figure 2.1. To simplify the description, we assume that we are in the situation where Alice needs to send some confidential information to Bob. Let $G$ be a finite cyclic group and let $g \in G$ be a (public) generator of this group. Alice and Bob respectively choose $X$ and $Y$ uniformly at random in $\{0, 1, \dots, |G|-1\}$, send $g^X$ and $g^Y$ to each other through the insecure (but authenticated) channel, and both compute $K = g^{XY}$. Without entering into the details (for which we refer to [47, 157]), the Diffie-Hellman key agreement protocol is assumed to be secure whenever the channels on which $g$, $g^X$, and $g^Y$ are sent are authenticated and as soon as it is computationally hard to solve the Diffie-Hellman Problem (DHP) in $G$, that is, given two inputs $U, V \in G$, compute $K = g^{XY}$ where $X = \log_g U$ and $Y = \log_g V$. In particular, this problem is assumed to be hard in $\mathbb{Z}_p^\star$, where $p$ is a large prime number. We note that in practice, the secret key $K$ will not be equal to $g^{XY}$ but rather to $h(g^{XY})$, where $h : G \to \{0,1\}^n$ is a hash function and $n$ is the secret key length.
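The key agreement can be sketched as follows. This is a toy illustration under several assumptions that are not in the text: the modulus is the Mersenne prime $2^{127}-1$ (far too small for real use), the base 3 is chosen for illustration and is not checked to generate $\mathbb{Z}_p^\star$, and SHA-256 truncated to $n = 128$ bits plays the role of the hash function $h$.

```python
import hashlib
import secrets

P_MOD = 2**127 - 1   # a Mersenne prime; illustrative size only, far too small in practice
G = 3                # illustrative base, NOT checked to generate the whole group

def keygen():
    """Draw a secret exponent and return it with the public value g^x mod p."""
    x = secrets.randbelow(P_MOD - 2) + 1          # exponent in {1, ..., p-2}
    return x, pow(G, x, P_MOD)

def shared_key(own_secret: int, peer_public: int, n_bytes: int = 16) -> bytes:
    """Compute h(g^{XY}), with h assumed to be SHA-256 truncated to n_bytes."""
    g_xy = pow(peer_public, own_secret, P_MOD)
    return hashlib.sha256(g_xy.to_bytes(16, "big")).digest()[:n_bytes]

x, g_x = keygen()    # Alice
y, g_y = keygen()    # Bob
# g_x and g_y are exchanged over the authenticated (but not confidential) channel.
k_alice = shared_key(x, g_y)
k_bob = shared_key(y, g_x)
print(k_alice == k_bob)
```

Both parties obtain the same value because $(g^Y)^X = (g^X)^Y = g^{XY}$; the sketch does nothing to authenticate the exchanged values, which the protocol requires.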

Since Diffie and Hellman, various other means of exchanging a common secret key over an authenticated channel have been invented. In particular, any public key cryptosystem (such as RSA [132], ElGamal [49], Paillier [128], Naccache-Stern [117], or Cramer-Shoup [38], to cite only a few) can be used.


Chapter 3

Block Ciphers Design: a Top-Down Approach

In this chapter we introduce typical block cipher designs. We first consider iterated block ciphers (which encompass almost all block ciphers widely used today) and their key schedules, and then consider three particular cases of iterated block ciphers, namely Feistel ciphers, ciphers based on the Lai-Massey scheme, and substitution-permutation networks.

It will then become evident that, whatever the kind of scheme, the building blocks used within it must have certain desirable properties. Finally, we detail the design of the Advanced Encryption Standard (AES), since the block cipher C that we introduce in Chapter 11 is based on it.

3.1 Iterated Block Ciphers and Key Schedules

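Let $\mathcal{T}$ and $\mathcal{K}$ respectively be the text space and the key space of a block cipher

$$\mathsf{C} = \{\mathsf{C}_k : \mathcal{T} \to \mathcal{T} : k \in \mathcal{K}\}.$$

Let $r > 0$ be a positive integer and let $\mathcal{K}_1, \mathcal{K}_2, \dots, \mathcal{K}_r$ be $r$ finite sets. $\mathsf{C}$ is said to be an $r$-round iterated block cipher when it can be written as

$$\mathsf{C}_k = R^{(r)}_{k_r} \circ R^{(r-1)}_{k_{r-1}} \circ \cdots \circ R^{(1)}_{k_1}, \tag{3.1}$$

for all $k \in \mathcal{K}$, where

$$R^{(i)} = \{R^{(i)}_{k_i} : \mathcal{T} \to \mathcal{T} : k_i \in \mathcal{K}_i\}$$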

is called the $i$th round of $\mathsf{C}$. Of course, this definition is not completely sound since, according to it, there is no unique way of expressing an iterated cipher. Usually, the $i$th round of a block cipher is successively made of

• a key-mixing phase, where the key $k_i$ is mixed with the data being encrypted,

• a confusion phase (in the sense of [139]), and

• a diffusion phase (also in the sense of [139]).

Figure 3.1: An $r$-round Feistel scheme $\Psi(f_1, f_2, \dots, f_r)$

The last round is often restricted to the key-mixing phase. Finally, $k_1, k_2, \dots, k_r$ are called the round keys of the block cipher and are derived from the main secret key $k$ by means of a deterministic algorithm called the key schedule. We will see that in most cases, the length of each round key is comparable to that of the main secret key, so that when this secret key is considered as a random variable $K$, the round keys $K_1, K_2, \dots, K_r$ cannot be independent.

3.2 Round Functions Based on Feistel Schemes

A Feistel scheme is a structure that allows one to construct a permutation on $2n$-bit strings based on functions of $n$-bit strings. An $r$-round Feistel scheme (with $r > 0$) based on the functions

$$f_1, f_2, \dots, f_r : \{0,1\}^n \to \{0,1\}^n$$

is denoted $\Psi(f_1, f_2, \dots, f_r)$ and is represented in Figure 3.1. It is easy to see that $\Psi(f_1, f_2, \dots, f_r)$ is invertible since

$$\Psi^{-1}(f_1, f_2, \dots, f_r) = \Psi(f_r, \dots, f_2, f_1).$$

To construct an $r$-round iterated block cipher $\mathsf{C} : \{0,1\}^{2n} \to \{0,1\}^{2n}$ (as in (3.1)) based on an $r$-round Feistel scheme, one typically defines a family of functions

$$f = \{f_k : \{0,1\}^n \to \{0,1\}^n : k \in \mathcal{K}'\}$$

and then lets, for all $k_i \in \mathcal{K}'$,

$$R^{(i)}_{k_i}(x_{\text{left}} \| x_{\text{right}}) = (x_{\text{right}} \| x_{\text{left}} \oplus f_{k_i}(x_{\text{right}})),$$

where $x_{\text{left}}$ (resp. $x_{\text{right}}$) denotes the left-most (resp. right-most) $n$ bits of the input $x$ of the round. Usually, the last round does not permute the outputs (as in Figure 3.1). In this way, the construction of a family of permutations on $2n$ bits reduces to that of a family of functions on $n$ bits. Moreover, Luby and Rackoff showed in [102] that from a secure family of functions, one only needs three rounds to obtain a secure block cipher (this is more formally stated in Chapter 10).
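The round structure above, and the fact that the scheme is inverted by running the same structure with the round functions in reverse order, can be checked on a small sketch. The 8-bit round functions below are arbitrary, deliberately non-invertible examples chosen for this illustration; they are assumptions and not part of the text.

```python
def feistel(x_left: int, x_right: int, round_functions):
    """Apply Psi(f_1, ..., f_r); the last round does not swap the two halves."""
    *first_rounds, last = round_functions
    for f in first_rounds:
        # (x_left || x_right) -> (x_right || x_left XOR f(x_right))
        x_left, x_right = x_right, x_left ^ f(x_right)
    return x_left ^ last(x_right), x_right

# Hypothetical round functions on n = 8 bits (not permutations).
fs = [
    lambda v: (v * v + 0x3C) & 0xFF,
    lambda v: (v >> 3) ^ 0xA5,
    lambda v: (7 * v + 1) & 0xF0,
]

y = feistel(0x12, 0x34, fs)
x = feistel(*y, list(reversed(fs)))   # Psi(f_r, ..., f_1) undoes Psi(f_1, ..., f_r)
print(x == (0x12, 0x34))
```

The round functions need not be invertible: decryption only ever evaluates them in the forward direction.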

Practical examples of block ciphers based on a Feistel scheme include the Data Encryption Standard (DES) [122] and Blowfish [134]. The block cipher KFC that we introduce in Chapter 12 is based on a three-round Feistel scheme.

3.3 Round Functions Based on Lai-Massey Schemes

Like the Feistel scheme, the Lai-Massey scheme enables us to construct a permutation from functions. An $r$-round Lai-Massey scheme is represented in Figure 3.2. This scheme was developed by Xuejia Lai and James Massey during the design of the block cipher IDEA [94]. The particularity of the scheme is that it requires a commutative and associative law (which can be the exclusive-or operation or more complex group laws, as in IDEA). As is, the Lai-Massey scheme is not secure even if the round functions are, the reason being that whatever the number of rounds, it is always true that

$$x_{\text{left}} - x_{\text{right}} = y_{\text{left}} - y_{\text{right}},$$

where $x = x_{\text{left}} \| x_{\text{right}}$ and $y = y_{\text{left}} \| y_{\text{right}}$ respectively denote the input and the output of the scheme. To break this undesirable property, Vaudenay demonstrates in [153] that introducing a special (fixed) permutation $\sigma$ at the output of each round’s left branch allows one to obtain security results equivalent to those of the Feistel scheme. The permutation $\sigma$ must be such that $z \mapsto \sigma(z) - z$ is also a permutation, in which case $\sigma$ is called an orthomorphism.
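The invariant can be observed on a minimal sketch. We assume here the usual form of a Lai-Massey round, $(x_{\text{left}}, x_{\text{right}}) \mapsto (x_{\text{left}} + t, x_{\text{right}} + t)$ with $t = f(x_{\text{left}} - x_{\text{right}})$, instantiated with the exclusive-or as group law and without the orthomorphism $\sigma$; the round functions are arbitrary examples introduced for this illustration.

```python
def lai_massey(x_left: int, x_right: int, round_functions):
    """Lai-Massey scheme over XOR, WITHOUT the orthomorphism sigma."""
    for f in round_functions:
        t = f(x_left ^ x_right)            # with XOR, subtraction is XOR as well
        x_left, x_right = x_left ^ t, x_right ^ t
    return x_left, x_right

# Hypothetical 8-bit round functions, chosen arbitrarily for the illustration.
fs = [
    lambda v: (v * 29 + 7) & 0xFF,
    lambda v: ((v << 3) | (v >> 5)) & 0xFF,
    lambda v: (v * v) & 0xFF,
]

x_left, x_right = 0x5A, 0xC3
y_left, y_right = lai_massey(x_left, x_right, fs)
print(x_left ^ x_right == y_left ^ y_right)
```

Since the same value $t$ is added to both branches, their difference is unchanged by every round, whatever the round functions are.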

Practical examples of block ciphers based on a Lai-Massey scheme include IDEA [94] and FOX [76].

3.4 Round Functions Based on Substitution-Permutation Networks

Figure 3.2: An $r$-round Lai-Massey scheme

The last typical skeleton is probably the one which is closest to Shannon’s conception of encryption [139], since it consists of a sequence wherein a substitution layer producing confusion is followed by a permutation layer producing diffusion. Although any block cipher can be seen as a substitution-permutation network, those based on the Feistel or the Lai-Massey schemes are usually not considered to be part of this category. The family of block ciphers SAFER [107, 109] (which we cryptanalyse in Chapter 9) and the Advanced Encryption Standard [41] (which we introduce in Section 3.7 and on which we base the design of the block cipher C in Chapter 11) are well-known examples of substitution-permutation networks.

3.5 Providing Diffusion: on the Need for Multipermutations

According to Shannon, the diffusion process should “dissipate [the redundancy] into long range statistics” [139]. Yet, this definition leaves quite some room for interpretation. Schnorr and Vaudenay formalize in [137] the concept of multipermutation, explaining what it technically means to provide good diffusion. Vaudenay further illustrates in [150] how fundamental this concept can be. In particular, he shows that if one replaces the substitution boxes of SAFER by other boxes, then one obtains a weak block cipher in more than 6% of cases, the reason being that the diffusion of SAFER is not a multipermutation. This is also a feature we exploit in the generalized linear cryptanalysis that we propose in Chapter 9.

Definition 3.1 An $(r, n)$-multipermutation over an alphabet $\mathcal{Z}$ is a function $f$ from $\mathcal{Z}^r$ to $\mathcal{Z}^n$ such that two different $(r+n)$-tuples of the form $(x, f(x))$ cannot collide in any $r$ positions.

Vaudenay notes that in the case where $f$ is linear, Definition 3.1 corresponds to MDS codes. For example, one of the core transformations of the AES diffusion is based on a linear multipermutation (i.e., on an MDS code). In Chapter 11 we take advantage of the inherent properties of MDS codes to prove certain security results concerning the block cipher C.
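For small alphabets, Definition 3.1 can be checked by brute force. The sketch below is an illustration only: the two linear maps over $\mathbb{Z}_5$ are examples chosen for this purpose (they do not come from the text), the first one corresponding to an MDS code and the second one not.

```python
from itertools import combinations, product

def is_multipermutation(f, alphabet, r: int, n: int) -> bool:
    """Check that no two distinct tuples (x, f(x)) agree on any r positions."""
    tuples = [tuple(x) + tuple(f(x)) for x in product(alphabet, repeat=r)]
    for positions in combinations(range(r + n), r):
        seen = set()
        for t in tuples:
            projection = tuple(t[i] for i in positions)
            if projection in seen:        # two different tuples collide here
                return False
            seen.add(projection)
    return True

Z5 = range(5)
# Example maps from Z_5^2 to Z_5^2 (assumptions made for the illustration).
mds_like = lambda x: ((x[0] + x[1]) % 5, (x[0] + 2 * x[1]) % 5)
weak = lambda x: ((x[0] + x[1]) % 5, (2 * x[0] + 2 * x[1]) % 5)

print(is_multipermutation(mds_like, Z5, 2, 2))
print(is_multipermutation(weak, Z5, 2, 2))
```

In the second map, both output coordinates depend on the input only through $x_1 + x_2$, so many tuples collide on the two output positions.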

3.6 Providing Confusion: Mixing key bits

Providing confusion is usually done by applying a (fixed) substitution box to a mixing of key bits and of text bits. This is the case for the DES, the AES, and SAFER. Probably the most well-known counter-example is IDEA. Sometimes the confusion is created by key-dependent substitution boxes, which is the case for Blowfish [134] for example, where the key bits have the particularity of being mixed with text bits in a non-linear way. It seems natural to look for substitution boxes as similar as possible to uniformly distributed random permutations (or functions, depending on the case), as indicated by several security results that we manage to prove for both C and KFC thanks to the ideal nature of the boxes we choose.

3.7 The Advanced Encryption Standard

As an example of a substitution-permutation network, we introduce the encryption part of the Advanced Encryption Standard [41]. The AES is a 128-bit block cipher made of $r = 10$ rounds in the case where 128-bit keys are used¹, all identical in their structure (except the last one). Each round is parameterized by a round key which is derived from the main 128-bit secret key using a so-called key schedule algorithm. The structure of each round is made of a (non-linear) substitution layer followed by a (linear) permutation layer.

1 The AES can also be used with 192-bit and 256-bit keys, in which cases the number of rounds is 12 and 14 respectively.

A 128-bit plaintext $p$ is considered as a $4 \times 4$ array of 8-bit elements $(p_{i,j})_{1 \le i,j \le 4}$ with

$$p = p_{1,1} \| p_{2,1} \| p_{3,1} \| p_{4,1} \| p_{1,2} \| \cdots \| p_{4,4}.$$

The first $r-1$ rounds successively apply to $p$ the following transformations:


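• AddRoundKey performs an exclusive-or operation between the bits of $p$ and the bits of the round key $k$:

$$\begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \\ a_{4,1} & a_{4,2} & a_{4,3} & a_{4,4} \end{pmatrix} = \begin{pmatrix} p_{1,1} & p_{1,2} & p_{1,3} & p_{1,4} \\ p_{2,1} & p_{2,2} & p_{2,3} & p_{2,4} \\ p_{3,1} & p_{3,2} & p_{3,3} & p_{3,4} \\ p_{4,1} & p_{4,2} & p_{4,3} & p_{4,4} \end{pmatrix} \oplus \begin{pmatrix} k_{1,1} & k_{1,2} & k_{1,3} & k_{1,4} \\ k_{2,1} & k_{2,2} & k_{2,3} & k_{2,4} \\ k_{3,1} & k_{3,2} & k_{3,3} & k_{3,4} \\ k_{4,1} & k_{4,2} & k_{4,3} & k_{4,4} \end{pmatrix}$$

• SubBytes applies to each 8-bit element $a_{i,j}$ a fixed substitution box $S[\cdot]$:

$$\begin{pmatrix} b_{1,1} & b_{1,2} & b_{1,3} & b_{1,4} \\ b_{2,1} & b_{2,2} & b_{2,3} & b_{2,4} \\ b_{3,1} & b_{3,2} & b_{3,3} & b_{3,4} \\ b_{4,1} & b_{4,2} & b_{4,3} & b_{4,4} \end{pmatrix} = \begin{pmatrix} S[a_{1,1}] & S[a_{1,2}] & S[a_{1,3}] & S[a_{1,4}] \\ S[a_{2,1}] & S[a_{2,2}] & S[a_{2,3}] & S[a_{2,4}] \\ S[a_{3,1}] & S[a_{3,2}] & S[a_{3,3}] & S[a_{3,4}] \\ S[a_{4,1}] & S[a_{4,2}] & S[a_{4,3}] & S[a_{4,4}] \end{pmatrix}$$

• ShiftRows shifts each row of the $4 \times 4$ array by an offset which depends on the row number:

$$\begin{pmatrix} c_{1,1} & c_{1,2} & c_{1,3} & c_{1,4} \\ c_{2,1} & c_{2,2} & c_{2,3} & c_{2,4} \\ c_{3,1} & c_{3,2} & c_{3,3} & c_{3,4} \\ c_{4,1} & c_{4,2} & c_{4,3} & c_{4,4} \end{pmatrix} = \begin{pmatrix} b_{1,1} & b_{1,2} & b_{1,3} & b_{1,4} \\ b_{2,2} & b_{2,3} & b_{2,4} & b_{2,1} \\ b_{3,3} & b_{3,4} & b_{3,1} & b_{3,2} \\ b_{4,4} & b_{4,1} & b_{4,2} & b_{4,3} \end{pmatrix}$$

• MixColumns applies a linear multipermutation to each column of $(c_{i,j})_{1 \le i,j \le 4}$. Each 8-bit element is considered as a member of the finite field with 256 elements $\mathrm{GF}(2^8)$. The elements of this finite field are represented by polynomials of degree less than 8 with coefficients in $\mathrm{GF}(2)$, standard operations are performed modulo the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$, and any 8-bit element $b = b_7 b_6 \dots b_0$ corresponds to the polynomial $b_7 x^7 + b_6 x^6 + \cdots + b_0$. With these notations, the MixColumns operation on the $j$th column of $(c_{i,j})_{i,j}$ is:

$$\begin{pmatrix} d_{1,j} \\ d_{2,j} \\ d_{3,j} \\ d_{4,j} \end{pmatrix} = \begin{pmatrix} \mathtt{0x02} & \mathtt{0x03} & \mathtt{0x01} & \mathtt{0x01} \\ \mathtt{0x01} & \mathtt{0x02} & \mathtt{0x03} & \mathtt{0x01} \\ \mathtt{0x01} & \mathtt{0x01} & \mathtt{0x02} & \mathtt{0x03} \\ \mathtt{0x03} & \mathtt{0x01} & \mathtt{0x01} & \mathtt{0x02} \end{pmatrix} \times \begin{pmatrix} c_{1,j} \\ c_{2,j} \\ c_{3,j} \\ c_{4,j} \end{pmatrix}$$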

The last round of the AES is identical to the $r-1$ previous ones, except that there is no MixColumns operation. Finally, a last AddRoundKey completes the algorithm.
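The arithmetic underlying MixColumns can be sketched as follows. The function names are introduced for this illustration only; the reduction constant `0x11B` encodes the polynomial $x^8 + x^4 + x^3 + x + 1$ and the matrix is the one given above.

```python
def xtime(b: int) -> int:
    """Multiply an element of GF(2^8) by x, modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8) (shift-and-add in characteristic 2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

MIX = [
    [0x02, 0x03, 0x01, 0x01],
    [0x01, 0x02, 0x03, 0x01],
    [0x01, 0x01, 0x02, 0x03],
    [0x03, 0x01, 0x01, 0x02],
]

def mix_column(column):
    """Apply the MixColumns matrix to one column (c_1j, c_2j, c_3j, c_4j)."""
    out = []
    for row in MIX:
        acc = 0
        for coefficient, byte in zip(row, column):
            acc ^= gf_mul(coefficient, byte)   # addition in GF(2^8) is XOR
        out.append(acc)
    return out

print([hex(d) for d in mix_column([0xDB, 0x13, 0x53, 0x45])])
```

A column whose four bytes are equal is left unchanged, since each row of the matrix sums to `0x01` in $\mathrm{GF}(2^8)$.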


Chapter 4

The Luby-Rackoff Model: Statistical Attacks against Block Ciphers

In Chapter 2 we considered computationally bounded adversaries and used them to determine the length of the secret key of a typical block cipher. We also showed that public-key cryptography can be used to turn an authentic channel into an (expensive) confidential channel that can be used to exchange this secret key.

In contrast, an adversary in the Luby-Rackoff Model is assumed to be computationally unbounded and only limited by the number of plaintext and/or ciphertext samples she has access to.

4.1 The Perfect Cipher and Security Models

Let $\mathcal{T}$ and $\mathcal{K}$ respectively be the text space and the key space of a block cipher

$$\mathsf{C} = \{\mathsf{C}_k : \mathcal{T} \to \mathcal{T} : k \in \mathcal{K}\}.$$

The block cipher $\mathsf{C}$ can be considered as a random permutation by simply considering that the key $K \in \mathcal{K}$ is a random variable. Intuitively, the perfect cipher should have no particular property common to each permutation that it defines. As a consequence, the perfect cipher

$$\mathsf{C}^\star : \mathcal{T} \to \mathcal{T}$$

is defined as a uniformly distributed random permutation on $\mathcal{T}$. Obviously, the perfect cipher cannot be implemented for realistic block sizes, since the key length is proportional to $\log(|\mathcal{T}|!)$. When studying the security of a block cipher $\mathsf{C}$ in the Luby-Rackoff model, one is essentially concerned with how easy it is to distinguish $\mathsf{C}$ from $\mathsf{C}^\star$.

More formally, we consider an algorithm, called a distinguisher and denoted by $A$, that queries an oracle $O$ which implements either a random instance of the block cipher $\mathsf{C}$ (a hypothesis that we denote $H_1 : O = \mathsf{C}$) or a random instance of the perfect cipher (a hypothesis that we denote $H_0 : O = \mathsf{C}^\star$). The distinguisher eventually outputs a bit to indicate which hypothesis between $H_0$ and $H_1$ is more likely to be correct. The ability to distinguish between these two hypotheses is called the advantage of the distinguisher and is defined by

$$\mathrm{Adv}_A(H_0, H_1) = \left| \Pr_{H_1}[A = 1] - \Pr_{H_0}[A = 1] \right|,$$

which we also denote by $\mathrm{Adv}_A(\mathsf{C}, \mathsf{C}^\star)$. The distinguisher is essentially limited by the number $q$ of queries it can make to the oracle, so that $A$ is usually referred to as a $q$-limited distinguisher. Furthermore, a distinguisher that can actually choose the $i$th query made to the oracle based on the answers to the $i-1$ previous ones is said to be adaptive. A distinguisher which asks the $q$ queries at once is said to be non-adaptive. Obviously, adaptive distinguishers are at least as powerful as non-adaptive distinguishers. We say that the block cipher $\mathsf{C}$ is resistant to $q$-limited (non-)adaptive distinguishers if any $q$-limited (non-)adaptive distinguisher $A$ has a negligible advantage.
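On a very small text space, the advantage of a given distinguisher can be computed exactly by enumeration. The sketch below is an illustration under assumptions that are not in the text: the text space is $\{0,1\}^2$, the toy cipher is $\mathsf{C}_k(x) = x \oplus k$, and the 2-limited non-adaptive distinguisher queries 0 and 1 and outputs 1 when the two answers differ by exactly 1 (as bit strings).

```python
from fractions import Fraction
from itertools import permutations

TEXTS = range(4)   # text space {0,1}^2, small enough to enumerate all permutations

def xor_cipher_instances():
    """All instances of the toy cipher C_k(x) = x XOR k, as lookup tables."""
    return [tuple(x ^ k for x in TEXTS) for k in TEXTS]

def perfect_cipher_instances():
    """All 4! permutations of the text space, i.e. the perfect cipher C*."""
    return list(permutations(TEXTS))

def distinguisher(oracle) -> int:
    """2-limited, non-adaptive: query 0 and 1, output 1 if the answers XOR to 1."""
    return 1 if oracle[0] ^ oracle[1] == 1 else 0

def acceptance_probability(instances) -> Fraction:
    """Pr[A = 1] when the oracle is a uniformly chosen instance."""
    return Fraction(sum(distinguisher(o) for o in instances), len(instances))

def advantage() -> Fraction:
    p1 = acceptance_probability(xor_cipher_instances())      # under H1
    p0 = acceptance_probability(perfect_cipher_instances())  # under H0
    return abs(p1 - p0)

print(advantage())
```

The toy cipher always satisfies the tested relation, whereas a uniformly distributed permutation satisfies it for 8 of its 24 instances, hence an advantage of $1 - 1/3 = 2/3$.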

This security model is the one used by Michael Luby and Charles Rackoff in [102] to study the security of the Feistel scheme (on which the DES is based). This is also the model in which we prove security results for the block ciphers C and KFC that we introduce in chapters 11 and 12 respectively.

4.2 From Distinguishing to Key Recovery

Most of the concrete statistical cryptanalytic attacks against block ciphers implicitly assume the Luby-Rackoff model. Moreover, most of the well-known attack categories (if not all) are non-adaptive. Within these, cryptanalysts generally distinguish between known-plaintext attacks (KPA), in which the adversary has no control over which queries are made to the oracle, and chosen-plaintext attacks (CPA), in which the queries follow a certain distribution chosen by the adversary. For example, linear cryptanalysis [110, 147] is a known-plaintext attack and differential cryptanalysis [21] is a chosen-plaintext attack.

The objective of a cryptanalytic attack can either be to distinguish between the two hypotheses mentioned in the previous section, namely $H_0 : O = \mathsf{C}^\star$ and $H_1 : O = \mathsf{C}$, or to recover the key that is used to encrypt the plaintext/ciphertext pairs that are available. In the rest of this section, we introduce a formalism close to Wagner’s unified view of block cipher cryptanalysis [163], which is based on Vaudenay’s model of statistical cryptanalysis [151]. We apply it within the scope of iterated ciphers and show why distinguishing attacks often lead to key recovery attacks.

Cryptanalytic attacks can be formalized using the notion of projection and commutative diagrams. Consider an adversary performing a known-plaintext attack against $r+1$ rounds of an iterated block cipher

$$\mathsf{C} = \{\mathsf{C}_k : \mathcal{T} \to \mathcal{T} : k \in \mathcal{K}\}.$$

To emphasize the fact that $\mathsf{C}$ is made of $r+1$ rounds, we denote it $\mathsf{C}^{(r+1)}$, each round being parameterized by a round key (computed from the main key $k$ by means of the key schedule). To simplify the notations, we assume that all the rounds have the same structure, so that we simply denote any round by $R$.

$$\begin{array}{ccc} \mathcal{T} & \xrightarrow{\;\rho\;} & \mathcal{X} \\ {\scriptstyle \mathsf{C}^{(r)}} \downarrow & & \downarrow {\scriptstyle g} \\ \mathcal{T} & \xrightarrow{\;\phi\;} & \mathcal{Y} \end{array}$$

Figure 4.1: A commutative diagram representing a distinguishing property on $\mathsf{C}^{(r)}$

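In an ideal scenario, the adversary is able to find a distinguishing property for the $r$ first rounds of the cipher. More formally, we assume that the adversary has discovered two projections

$$\rho : \mathcal{T} \to \mathcal{X} \quad \text{and} \quad \phi : \mathcal{T} \to \mathcal{Y}$$

(where $\mathcal{X}$ and $\mathcal{Y}$ typically are sets of small cardinality) and some function $g : \mathcal{X} \to \mathcal{Y}$ such that

$$g \circ \rho = \phi \circ \mathsf{C}^{(r)}_k \tag{4.1}$$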

holds for all keys $k \in \mathcal{K}$. Assume also that this property is not trivial, i.e., not true in general if we replace $\mathsf{C}_k$ by $\mathsf{C}^\star$. This can be represented by means of a commutative diagram as shown in Figure 4.1, in which the facts that (4.1) holds and that the diagram commutes are equivalent. In such a case, the adversary can often mount a key recovery attack against $r+1$ rounds of the block cipher by first guessing the last round key $k_{r+1}$, decrypting one round of the cipher for all the ciphertexts made available to her using her guess $\tilde{k}$ of $k_{r+1}$, and finally checking whether

$$g \circ \rho = \phi \circ R^{-1}_{\tilde{k}} \circ \mathsf{C}^{(r+1)}_k \tag{4.2}$$

holds. When her guess is correct, i.e., when $\tilde{k} = k_{r+1}$, then (4.2) is equivalent to (4.1), so that it will always hold. When $\tilde{k} \neq k_{r+1}$, we can consider that the adversary is actually performing an additional one-round encryption of all the ciphertexts. Consequently, we can (abusively) consider that the adversary checks whether

$$g \circ \rho = \phi \circ \mathsf{C}^{(r+2)}_k \tag{4.3}$$

holds in this case. As the distinguishing property was assumed to be non-trivial, there is no particular reason why (4.3) should hold, so that the adversary will easily check that her guess is incorrect, as (4.3) is likely to be false for several plaintext/ciphertext pairs.

In practical attacks, it is usually only necessary to guess some bits of the last round key in order to check the distinguishing property, the remaining bits being recovered by exhaustive search. Once $k_{r+1}$ is recovered, the adversary can peel off an entire round of the block cipher and iterate the whole process (usually, once a distinguishing property can be found for a certain number of rounds, a distinguishing property on fewer rounds is easy to find). In certain cases, recovering the last round key can be sufficient to recover the key $k$.

As a distinguishing attack on $\mathsf{C}^{(r)}_k$ often leads to a key recovery on $\mathsf{C}^{(r+1)}_k$, from now on we only consider distinguishing attacks, i.e., attacks aiming at finding some non-trivial distinguishing property on the block cipher. We illustrate these notions by introducing a concrete example, namely linear cryptanalysis.

4.3 Linear Cryptanalysis

Linear cryptanalysis is a known-plaintext attack proposed by Matsui in [110] to break the DES [122], based on concepts introduced by Tardy-Corfdir and Gilbert in [147]. It assumes that the plaintexts are independent and uniformly distributed in the text space $\mathcal{T} = \{0,1\}^n$, and considers linear (in the sense of $\mathrm{GF}(2)$) binary projections of the form

$$\rho(P) = a \bullet P = a_0 P_0 \oplus a_1 P_1 \oplus \cdots \oplus a_{n-1} P_{n-1} \in \{0,1\},$$

where $a \in \{0,1\}^n$ is called a mask. Essentially, linear cryptanalysis aims at finding an input mask $a$ and an output mask $b$ on $r$ rounds of an iterated cipher $\mathsf{C}$, such that

$$(a \bullet P) \oplus (b \bullet \mathsf{C}^{(r)}_k(P)) = 0 \tag{4.4}$$

holds with a probability far from $\frac{1}{2}$ for all keys $k \in \mathcal{K}$. More precisely, if one lets $\frac{1}{2} + \epsilon$ be the probability that the linear relation (4.4) holds, then the efficiency of the cryptanalysis based on it is known to depend on the linear probability coefficient [32]

$$\mathrm{LP}_{a,b}(\mathsf{C}_k) = \mathrm{LP}\left((a \bullet P) \oplus (b \bullet \mathsf{C}^{(r)}_k(P))\right) = 4\epsilon^2,$$

where the linear probability of a random bit $B$ is defined by

$$\mathrm{LP}(B) = (2\Pr[B = 0] - 1)^2 = \left(\mathrm{E}\left((-1)^B\right)\right)^2.$$

The linear probability is often assumed to be close to the expected linear probability

$$\mathrm{ELP}_{a,b}(\mathsf{C}) = \mathrm{E}_K(\mathrm{LP}_{a,b}(\mathsf{C}_K)),$$

a hypothesis referred to as the hypothesis of stochastic equivalence (a concept formalized by Lai [94, 97]).
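For a permutation given as a lookup table, these quantities can be computed exactly by enumerating the plaintexts. The sketch below is an illustration only: the function names, the 3-bit toy cipher $\mathsf{C}_k(x) = x \oplus k$, and the 3-bit substitution box are assumptions introduced for the example.

```python
from fractions import Fraction

def dot(mask: int, value: int) -> int:
    """Binary scalar product a . P: parity of the bits selected by the mask."""
    return bin(mask & value).count("1") & 1

def linear_probability(a: int, b: int, perm, n_bits: int) -> Fraction:
    """LP_{a,b} = (2 Pr[a.P XOR b.perm(P) = 0] - 1)^2 for uniform P."""
    size = 1 << n_bits
    zeros = sum(1 for p in range(size) if dot(a, p) ^ dot(b, perm[p]) == 0)
    return (2 * Fraction(zeros, size) - 1) ** 2

def expected_linear_probability(a: int, b: int, instances, n_bits: int) -> Fraction:
    """ELP_{a,b}: average of LP_{a,b} over uniformly distributed keys."""
    total = sum(linear_probability(a, b, perm, n_bits) for perm in instances)
    return total / len(instances)

N_BITS = 3
xor_cipher = [[x ^ k for x in range(8)] for k in range(8)]   # hypothetical toy cipher
sbox = [3, 6, 0, 5, 7, 1, 4, 2]                              # hypothetical 3-bit S-box

print(expected_linear_probability(0b011, 0b011, xor_cipher, N_BITS))
print(linear_probability(0b011, 0b101, sbox, N_BITS))
```

For the linear toy cipher, equal input and output masks give a linear probability of 1 whatever the key, which is the kind of bias that a linear cryptanalysis exploits.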

In practice, to derive a linear relation such as (4.4) on an iterated cipher made of $r$ rounds, the cryptanalyst first derives adequate linear relations on each round of the block cipher, such that the output mask of round $i-1$ is equal to the input mask of round $i$. This forms a so-called characteristic $(a_0, a_1, \dots, a_r)$. Using Matsui’s piling-up lemma, which states that for two independent random bits $B_1$ and $B_2$ we have

$$\mathrm{LP}(B_1 \oplus B_2) = \mathrm{LP}(B_1) \cdot \mathrm{LP}(B_2),$$

the cryptanalyst then usually assumes that

$$\mathrm{ELP}_{a_0,a_r}(\mathsf{C}) \approx \prod_{i=1}^{r} \mathrm{ELP}_{a_{i-1},a_i}(R). \tag{4.5}$$

This strategy is the one adopted by Matsui in his cryptanalysis of the DES. In that particular case, the experiments justify the approximations [72, 111].

Yet, Nyberg shows in [125] that the right-hand side of (4.5) essentially underestimates the true expected linear probability since, in the case of Markov ciphers [97] (see Definition 8.8), we actually have

$$\mathrm{ELP}_{a_0,a_r}(\mathsf{C}) = \sum_{a_1,\dots,a_{r-1}} \prod_{i=1}^{r} \mathrm{ELP}_{a_{i-1},a_i}(R),$$

a property which is often referred to as the linear hull effect. We emphasize the fact that since the approximation (4.5) underestimates the true value of the expected linear probability, it also underestimates the efficiency of the cryptanalysis. Whereas this is perfectly acceptable from the point of view of the adversary (since the attack can only perform better than expected), it is unfortunate to see the same approximation made in so-called security proofs of block ciphers. For example, the maximum value (over all non-zero input/output masks) of the expected linear probability over 8 rounds of the AES was initially assumed to be less than $2^{-300}$ [41, pp. 30–31], which is obviously wrong: since for any input mask $a$, the sum over all the $2^{128}$ values of $\mathrm{ELP}_{a,b}(\mathrm{AES})$ (one per output mask $b$) is equal to 1, at least one must be greater than $2^{-128}$. Yet, in that particular case, Keliher proves that the maximum value of the ELP’s can be bounded by $1.778 \cdot 2^{-107}$ for 8 or more rounds [79–81]. In the cases of the block ciphers C and KFC that we introduce in chapters 11 and 12 respectively, we manage to compute the exact value of the expected linear probability (taking the linear hull effect into account) for various numbers of rounds.
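The counting argument used above for the AES can be checked on a small scale. The sketch below is an illustration under assumptions: it uses a randomly drawn 4-bit permutation instead of the AES and checks that, for a fixed input mask, the linear probabilities summed over all output masks give exactly 1, so that the largest of them is at least $2^{-n}$. As the identity holds for each fixed permutation, it also holds on average over the key.

```python
import random
from fractions import Fraction

def dot(mask: int, value: int) -> int:
    """Binary scalar product: parity of the bits selected by the mask."""
    return bin(mask & value).count("1") & 1

def linear_probability(a: int, b: int, perm, n_bits: int) -> Fraction:
    """LP_{a,b} = (2 Pr[a.P XOR b.perm(P) = 0] - 1)^2 for uniform P."""
    size = 1 << n_bits
    zeros = sum(1 for p in range(size) if dot(a, p) ^ dot(b, perm[p]) == 0)
    return (2 * Fraction(zeros, size) - 1) ** 2

def lp_row(a: int, perm, n_bits: int):
    """Linear probabilities LP_{a,b} for a fixed input mask a and every output mask b."""
    return [linear_probability(a, b, perm, n_bits) for b in range(1 << n_bits)]

N_BITS = 4
perm = list(range(1 << N_BITS))
random.Random(2008).shuffle(perm)      # an arbitrary 4-bit permutation

row = lp_row(0b0110, perm, N_BITS)
print(sum(row), max(row) >= Fraction(1, 1 << N_BITS))
```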

Other examples of statistical cryptanalytic attacks include differential cryptanalysis (which is a chosen-plaintext attack introduced by Biham and Shamir in [23]), several of its variants (such as truncated differentials [88], impossible differentials [18] or higher order differentials [88, 95]), Vaudenay’s $\chi^2$ cryptanalysis [62, 151], and integral attacks [69, 93]. An exhaustive review is provided by Junod in [73].


Chapter 5

Notations and Elementary Results

We introduce in this last chapter the notations that will be used throughout as well as some elementary results.

5.1 Random Variables, Probabilities, Strings, etc.

If $\mathcal{Z}$ is a finite set, we denote by $|\mathcal{Z}|$ its cardinality. Let $\mathsf{P}$ denote a probability distribution over the finite set $\mathcal{Z}$. We denote the fact that a random variable $X$ is drawn according to the distribution $\mathsf{P}$ by $X \sim \mathsf{P}$, or by $X \xleftarrow{\mathsf{P}} \mathcal{Z}$ in the case of an algorithm. The probability that $X$ takes a particular value $a \in \mathcal{Z}$ is either denoted by $\Pr_{\mathsf{P}}[a]$, $\Pr[X = a]$, or $\mathsf{P}[a]$, where in the last case the probability distribution is simply seen as a vector in $[0,1]^{|\mathcal{Z}|}$. The support of $\mathsf{P}$ is the subset of $\mathcal{Z}$ made of all elements $a$ such that $\mathsf{P}[a] \neq 0$ and is denoted by $\mathrm{supp}(\mathsf{P})$. The distribution $\mathsf{P}$ is said to be of full support if $\mathrm{supp}(\mathsf{P}) = \mathcal{Z}$. If $A$ and $B$ denote some random events such that $\Pr[A] > 0$, we denote by $\Pr[B|A]$ or $\Pr_A[B]$ the probability of the event $B$ given the occurrence of the event $A$.

If $z_1, z_2, \dots, z_q \in \mathcal{Z}$ are $q$ elements of $\mathcal{Z}$, we denote by $z^q = (z_1, z_2, \dots, z_q) \in \mathcal{Z}^q$ the vector having $z_i$ as its $i$th component. We adopt a similar notation for random variables. If $Z_1, Z_2, \dots, Z_q \in \mathcal{Z}$ are $q$ independent and identically distributed (i.i.d.) random variables drawn according to the distribution $\mathsf{P}$, we denote by $\mathsf{P}^q$ the distribution of $Z^q = (Z_1, Z_2, \dots, Z_q)$, so that

$$\Pr[Z_1 = z_1, Z_2 = z_2, \dots, Z_q = z_q] = \Pr[Z^q = z^q] = \Pr_{\mathsf{P}^q}[z^q] = \mathsf{P}^q[z^q].$$

We note that since the random variables are assumed to be independent, it always holds that $\mathsf{P}^q[z^q] = \prod_{i=1}^{q} \mathsf{P}[z_i]$.

The set $\mathcal{F}$ of all functions from the finite set $\mathcal{Z}$ to $\mathbb{R}$ is a vector space of finite dimension, thus all norms $\|\cdot\|$ on this set define the same topology. The open ball of radius $\epsilon > 0$ around $f_0 \in \mathcal{F}$ is the set $\{f \in \mathcal{F} : \|f - f_0\| < \epsilon\}$. An open set is a union of open balls. The interior of a set $\Pi$ is the union of all open sets included in $\Pi$ and is denoted by $\mathring{\Pi}$. The closed ball of radius $\epsilon > 0$ around $f_0 \in \mathcal{F}$ is the set
