Eﬀect of padding in low-area architectures

15.3 Comparison of SAMOSA and Grøstl in terms of the hardware performance

15.3.3 Eﬀect of padding in low-area architectures

SAMOSA has a simpler padding unit. The diﬀerence is shown in Figure 13 for the case of byte padding,i.e.,padding of messages that end on a boundary of a byte. The elimination of Block Counter reduces the complexity of the control unit as well as the area associated with the padding logic. This reduction, although relatively minor for high-speed implementations, may prove to be quite signiﬁcant for low area implementations.

15.4 Comparison of SAMOSA with the SHA-3 ﬁnalists

Tables 5 and 6 present the comparison betweenSAMOSAand the SHA-3 ﬁnalists using the best single-message architecture,i.e., architecture capable of processing only one message at a time. All algorithms have been implemented without padding units, in two variants, with 256-bit and 512-bit output, in Xilinx Virtex 5 and Altera Stratix III FPGAs. The primary metric used for comparison is throughput to area ratio. All results, other than the results forSAMOSA, are based on [16].

In terms of the throughput to area ratio, SAMOSA performs consistently better than BLAKE, Grøstl and Skein, and loses only to Keccak and JH in both 256-bit and 512-bit variants. Additionally, it reduces the gap in performance to Keccak and JH as compared to Grøstl. It also outperforms Skein in the 512-bit variant on Xilinx Virtex 5, where Grøstl loses to Skein. Furthermore, due to its similarity to Grøstl, SAMOSA has an additional advantage compared to other SHA-3 candidates when resource sharing with the Advanced Encryption Standard (AES) is possible, as demonstrated in [36].

16 Conclusion and Open Problems

This paper gives proposal for a novel permutation based hash mode of operation named FP. Our indiﬀerentiability security analysis establishes that the new mode is secure against all generic attacks up to approximately 2n/2 queries; more interestingly, our experimental results, based on randomly generatedreconstructiongraphs using C programs, suggest that the security bound can be improved to nearly 2n queries (n is the hash size in bits). We leave the proof of this improved result as an open problem.

Table 5: SAMOSA and the best single message architectures of the SHA-3 ﬁnalists for the 256-bit variants of hash functions

Xilinx Virtex 5 Altera Stratix III

Ranking Architecture Throughput Area TP/Area Ranking Architecture Throughput Area TP/Area

(Mbits/s) (CLB slices) (Mbits/s) (ALUTs)

Keccak x1 13337 1369 9.74 Keccak x1 15493 3531 4.39 JH x1 4955 982 5.05 JH x1 5276 3221 1.64 SAMOSA x1 5516 1305 4.23 SAMOSA x1 5969 4851 1.23 Grøstl x1 (P/Q) 6117 1795 3.41 Grøstl /2(v) (P/Q) 3818 3914 0.98 Skein x4 3023 1218 2.48 Skein x4 2475 3943 0.63 BLAKE /4(v)/4(h) 389 231 1.68 BLAKE /2(h) 2158 3553 0.61

Table 6: SAMOSA and the best single message architectures of the SHA-3 ﬁnalists for the 512-bit variants of hash functions

Xilinx Virtex 5 Altera Stratix III

Ranking Architecture Throughput Area TP/Area Ranking Architecture Throughput Area TP/Area

(Mbits/s) (CLB slices) (Mbits/s) (ALUTs)

Keccak x1 7612 1320 5.77 Keccak x1 8526 3471 2.46 JH x1 4686 992 4.72 JH x1 5011 3288 1.52 SAMOSA x1 7133 2559 2.79 SAMOSA x1 8298 9810 0.85 Skein x4 3084 1418 2.17 Grøstl /2(v) (P/Q) 5262 7763 0.68 Grøstl /2(v) (P/Q) 4816 2336 2.06 Skein x4 2438 4006 0.61 BLAKE /4(v)/4(h) 560 386 1.45 BLAKE /2(h) 2928 6977 0.42

We also design a concrete hash function familySAMOSAbased on theFPmode and the P permutations of the SHA-3 finalist Grøstl; we claim it is hard to attackSAMOSA with complexities significantly less than the brute force. Our FPGA hardware implementations of SAMOSA show remarkable improvement in the throughput to area ratio compared to the SHA-3 finalists Grøstl, BLAKE and Skein. It is still not known how efficientSAMOSA is in software. We leave the software implementations of SAMOSA as future work.

Acknowledgments

The authors like to thank Dustin Moody and Daniel Smith-Tone for numerous helpful discussions.

References

[1] Elena Andreeva, Bart Mennink, and Bart Preneel. The parazoa family: generalizing the sponge hash functions. Int. J. Inf. Sec., vol. 11, Number 3, pp. 149–165, 2012. (Cited on page 6.)

[2] ATHENa Project Website,http://cryptography.gmu.edu/athena (Cited on pages 33, 34 and 35.)

[3] Guido Bertoni, Joan Daemen, Micha¨el Peeters, and Gilles Van Assche. Sponge Func- tions. ECRYPT 2007, 2007. http://sponge.noekeon.org/SpongeFunctions.pdf. Ac- cessed March 2012. (Cited on page 6.)

[4] Guido Bertoni, Joan Daemen, Micha¨el Peeters, and Gilles Van Assche. On the Indif- ferentiability of the Sponge Construction. In Nigel P. Smart, editor, EUROCRYPT, volume 4965 of Lecture Notes in Computer Science, pages 181–197. Springer, 2008. (Cited on page 6.)

[5] Rishiraj Bhattacharyya and Avradip Mandal. On the Indiﬀerentiability of Fugue and Luﬀa. In Javier Lopez and Gene Tsudik, editors,ACNS, volume 6715 ofLecture Notes in Computer Science, pages 479–497, 2011. (Cited on page 6.)

[6] Rishiraj Bhattacharyya, Avradip Mandal, and Mridul Nandi. Security Analysis of the Mode of JH Hash Function. In Seokhie Hong and Tetsu Iwata, editors, FSE, volume 6147 ofLecture Notes in Computer Science, pages 168–191. Springer, 2010. (Cited on page 6.)

[7] Eli Biham and Orr Dunkelman. A framework for iterative hash functions – HAIFA. Second NIST Cryptographic Hash Workshop, 2006, 2006. (Cited on page 5.)

[8] Alex Biryukov, Orr Dunkelman, Nathan Keller, Dmitry Khovratovich, and Adi Shamir. Key Recovery Attacks of Practical Complexity on AES-256 Variants with up to 10 Rounds. In Henri Gilbert, editor, EUROCRYPT, volume 6110 of Lecture Notes in Computer Science, pages 299–319. Springer, 2010. (Cited on page 5.) [9] John Black, Phillip Rogaway, and Thomas Shrimpton. Black-Box Analysis of the

Block-Cipher-Based Hash-Function Constructions from PGV. In Moti Yung, editor, CRYPTO, volume 2442 of Lecture Notes in Computer Science, pages 320–335. Springer, 2002. (Cited on page 5.)

[10] Simon R. Blackburn, Douglas R. Stinson, and Jalaj Upadhyay. On the complexity of the herding attack and some related attacks on hash functions. Des. Codes Cryptog- raphy, 64(1-2):171–193, 2012. (Cited on page 5.)

[11] Christophe De Canni`ere, Hisayoshi Sato, and Dai Watanabe. The Luﬀa Hash Function. The 1st SHA-3 Candidate Conference. (Cited on page 6.)

[12] CERG Group, George Mason University: Hardware Inter- face of a Secure Hash Algorithm (SHA), available on-line at

http://cryptography.gmu.edu/athena/index.php?id=interfaces (Cited on page 34.)

[13] Jean-S´ebastien Coron, Yevgeniy Dodis, C´ecile Malinaud, and Prashant Puniya. Merkle-Damg˚ard Revisited: How to Construct a Hash Function. In Victor Shoup,

editor, CRYPTO 2005, volume 3621 of Lecture Notes in Computer Science, pages 430–448. Springer, 2005. (Cited on pages 5 and 9.)

[14] Yevgeniy Dodis, Leonid Reyzin, Ronald L. Rivest, and Emily Shen. Indiﬀerentiability of Permutation-Based Compression Functions and Tree-Based Modes of Operation, with Applications to MD6. In Orr Dunkelman, editor, FSE, volume 5665 of Lecture Notes in Computer Science, pages 104–121. Springer, 2009. (Cited on page 6.) [15] Ewan Fleischmann, Michael Gorski, and Stefan Lucks. Some Observations on In-

diﬀerentiability. In Ron Steinfeld and Philip Hawkes, editors, ACISP, volume 6168 of Lecture Notes in Computer Science, pages 117–134. Springer, 2010. (Cited on page 10.)

[16] Kris Gaj, Ekawat Homsirikamol, Marcin Rogawski, Rabia Shahid, and Malik Umar Sharif. Comprehensive Evaluation of High-Speed and Medium-Speed Implementations of Five SHA-3 Finalists Using Xilinx and Altera FPGAs. Cryptology ePrint Archive, Report 2012/368, 2012, available online at http://eprint.iacr.org/2012/368.pdf

(Cited on pages 8, 33, 34 and 39.)

[17] Praveen Gauravaram, Lars R. Knudsen, Krystian Matusiewicz, Florian Mendel, Chris- tian Rechberger, Martin Schl¨aﬀer, and Søren S. Thomsen. Grøstl - a SHA-3 candidate. The 1st SHA-3 Candidate Conference. (Cited on pages 6 and 31.)

[18] Ekawat Homsirikamol, Marcin Rogawski, and Kris Gaj. Throughput vs. Area Trade- oﬀs in High-Speed Architectures of Five Round 3 SHA-3 Candidates Implemented Using Xilinx and Altera FPGAs. LNCS 6917, Cryptographic Hardware and Embedded Systems workshop, CHES 2011, Nara, Japan, Sep. 28-Oct. 1, pp. 491-506. (Cited on page 34.)

[19] J´er´emy Jean, Mar´ıa Naya-Plasencia, and Thomas Peyrin. Improved Rebound At- tack on the Finalist Grstl. FSE 2012, Washington DC, March 19-21, 2012,

http://www.di.ens.fr/~jean/pub/fse2012.pdf (Cited on page 32.)

[20] Antoine Joux. Multicollisions in Iterated Hash Functions: Application to Cascaded Constructions. In Matthew K. Franklin, editor, CRYPTO 2004, volume 3152 of Lec- ture Notes in Computer Science, pages 306–316. Springer, 2004. (Cited on page 5.) [21] John Kelsey and Tadayoshi Kohno. Herding Hash Functions and the Nostradamus

Attack. In Serge Vaudenay, editor, EUROCRYPT, volume 4004 of Lecture Notes in Computer Science, pages 183–200. Springer, 2006. (Cited on page 5.)

[22] John Kelsey and Bruce Schneier. Second Preimages on n-Bit Hash Functions for Much Less than 2n Work. In Ronald Cramer, editor, EUROCRYPT 2005, volume 3494 of Lecture Notes in Computer Science, pages 474–490. Springer, 2005. (Cited on page 5.)

[23] Dmitry Khovratovich. Bicliques for permutations: collision and preimage attacks in stronger settings. Cryptology ePrint Archive, Report 2012/141,

http://eprint.iacr.org/2012/141 (Cited on page 32.)

[24] Özgül Kü¸cük. Design and Analysis of Cryptographic Hash Functions. PhD thesis, KU Leuven, 2012. http://www.iacr.org/phds/?p=detail&entry=777. (Cited on page 6.)

[25] Kashif Latif, M Muzaﬀar Rao, Arshad Aziz, and Athar Mahboob, Eﬃcient Hard- ware Implementations and Hardware Performance Evaluation of SHA-3 Finalists, The Third SHA-3 Candidate Conference, Washington, D.C., March 22-23, 2012. (Cited on pages 33 and 34.)

[26] Stefan Lucks. A failure-friendly design principle for hash functions. In Bimal K. Roy, editor, ASIACRYPT, volume 3788 ofLecture Notes in Computer Science, pages 474–494. Springer, 2005. (Cited on page 5.)

[27] Ueli M. Maurer, Renato Renner, and Clemens Holenstein. Indiﬀerentiability, impos- sibility results on reductions, and applications to the random oracle methodology. In TCC, pages 21–39, 2004. (Cited on page 10.)

[28] Dustin Moody, Souradyuti Paul and Daniel Smith-Tone. Indiﬀerentiability Security of the Fast Widepipe Hash: Breaking the Birthday Barrier. Cryptology ePrint Archive, Report 2011/630. (Cited on page 11.)

[29] Dustin Moody, Souradyuti Paul and Daniel Smith-Tone. Improved Indiﬀerentiability Security Bound for the JH Mode. 3rd SHA-3 Candidate Conference, 2012. (Cited on pages 6 and 11.)

[30] NIST. Secure hash standard. InFederal Information Processing Standard, FIPS 180-2, April 1995. (Cited on page 5.)

[31] Mridul Nandi and Souradyuti Paul. Speeding up the wide-pipe: Secure and fast hashing. In Guang Gong and Kishan Chand Gupta, editors, INDOCRYPT, volume 6498 ofLecture Notes in Computer Science, pages 144–162. Springer, 2010. (Cited on page 6.)

[32] Christian Rechberger. Grøstl Update. 3rd SHA-3 Candidate Conference, 2012, Wash- ington DC, USA. (Cited on page 32.)

[33] Thomas Ristenpart, Hovav Shacham, and Thomas Shrimpton. Careful with Composi- tion: Limitations of the Indiﬀerentiability Framework. In Kenneth G. Paterson, editor, EUROCRYPT, volume 6632 of Lecture Notes in Computer Science, pages 487–506. Springer, 2011. (Cited on page 10.)

[34] Ron Rivest. The MD5 message-digest algorithm. In IETF RFC 1321, 1992. (Cited on page 5.)

[35] Ron Rivest. The MD6 Hash Function. (Cited on page 6.)

[36] Marcin Rogawski and Kris Gaj, A High-Speed Hardware Architecture for AES and the SHA-3 Candidates Grøstl, 15th EUROMICRO Conference on Digital System Design Architectures, DSD 2012. (Cited on page 39.)

[37] Martin Schläffer. Updated Differential Analysis of Grøstl. January 2011,

http://www.groestl.info/groestl-analysis.pdf (Cited on page 32.)

[38] SHA-3 Zoo Hardware Implementations,

http://ehash.iaik.tugraz.at/wiki/SHA-3_Hardware_Implementations (Cited on

page 33.)

[39] Hongjun Wu. The JH Hash Function. The 1st SHA-3 Candidate Conference (2009). (Cited on page 6.)

[40] Shuang Wu, Dengguo Feng, Wenling Wu, Jian Guo, Le Dong, and Jian Zou. (Pseudo) Preimage Attack on Round-Reduced Grøstl Hash Function and Others (Extended Ver- sion). Cryptology ePrint Archive, Report 2012/206,http://eprint.iacr.org/2012/206

(Cited on page 32.)

A

Deﬁnitions

Deﬁnition A.1 (Random oracle) A random oracle is a function RO:X → Y chosen uniformly at random from the set of all|Y||X| functions that mapX →Y. In other words, a function RO : X → Y is a random oracle if and only if, for each x ∈ X, the value of

RO(x) is chosen uniformly at random from Y.

B

Time costs of

FullGraph

and the simulator-pair (S,

S

−1

)

Since there areiqueries afterirounds, the maximum number of nodes inTsafteriround is i2. Therefore, to constructTs in thei-th round, the amount of time required by FullGraph isO(i4). Now, if the adversary submitsσ queries, then the time complexity of FullGraphis

O(σ5). Since the time of FullGraph dominates over the other costs such as MessageRecon, the worst-case simulator time complexity of (S,S−1) is alsoO(σ5).

C

Proof of

(8)

(8) is as follows: Pr[AG1 ⇒1|GOOD1σ ] = Pr[AG2⇒1|GOOD1σ ] .

Let V_i1 and V_i2 denote the views of the systems G1, and G2 respectively, after i queries have been processed. To prove (8), it suﬃces to show that given GOOD1σ, the views Vσ1 andV2

σ are identically distributed. We do this by induction on the number of queriesi=σ. Induction Hypothesis: GivenGOOD1i,Vi1 and Vi2 are identically distributed.

Base: When i= 0, then no query has been made; therefore the hypothesis is true.

Induction Step: Now assume the induction hypothesis holds. We have to show that if GOOD1i+1 occurred, then Vi1+1 and Vi2+1 are identically distributed.

Let (I_i1₊₁, O_i1₊₁) and (I_i2₊₁, O2_i₊₁) denote the input-output pairs for the systems G1 and G2 respectively in thei+ 1st round. Note that the induction hypothesis implies thatV_i1 and V_i2are identically distributed givenGOODioccurred. Also note thatVi1+1 =Vi1||Ii1+1||Oi1+1, and V_i2₊₁ =V_i2||I_i2₊₁||O2_i₊₁.

A little reﬂection shows that proving the induction step is equivalent to proving the following proposition.

Proposition C.1 (Proof of Induction Step) GivenGOOD1i+1 andVi1=Vi2 1. the input-views I_i1₊₁ and I_i2₊₁ are identically distributed;

2. if I_i1₊₁ =I_i2₊₁ then the output-views O1_i₊₁ and O_i2₊₁ are identically distributed.

Proof.

1. This result is easy sinceV1 i =Vi2.

2. To prove this, we ﬁrst establish the following lemma which is the main ingredient in our proof.

Lemma C.2 The reconstruction graphsTsof the systems G1 and G2 are isomorphic after irounds, given GOODi and Vi1 =Vi2.

Proof. For a freshπ- orπ−1-query, the graphTπ of system G1 is augmented in one phase (see the subroutinePartialGraph of Figure 5). In that phase, all possible nodes generated from a freshπ-query are added to the graphTπ. A straightforward analysis of the Type1 events shows that if these events do not occur then no nodes can be added beyond this phase. In other words, if Type1 events do not occur in i rounds then the Tπ is a full

reconstruction graph for Dπ. Consequently, Ts – which is a connected subgraph of Tπ rooted atIV, IV′ – is afull reconstruction graph forDs.

We note that the graphTs for G2 is also afull reconstruction graph forDs.

SinceV₁i=V₂i, the graphsTs forG1 and G2 are isomorphic afterirounds. 2 LetIi+1denote the shared query inputI₁i+1=I₂i+1. We continue by considering all possible cases based on a set of conditions for the system G1 in thei+ 1st round. Our decision tree produced 17 cases, which have been derived from a sequence of questions (see Figure 14): Cases 1 through 9 consider when Ii+1 is an s-query, cases 10 through 11 consider when Ii+1 is ans−1-query, while cases 12 through 17 consider whenIi+1 is the round input of a long query.

Case 1: s-query, Fresh, |M|= 0.

Implication. The condition directly implies that O_i1₊₁ follows the uniform distribution

U[0,22n−1], since a Type0 event did not occur in thei+ 1st round. Since the graphsTs are isomorphic in both systems G1 and G2 by Lemma C.2,|M|= 0 for G2. This implies thatO2_i₊₁ follows the uniform distribution U[0,22n−1].

Case 2: s-query, Old, Type Q1, Q2, Q4, or Q5-2,|M|= 0.

Implication. This case is impossible since GOOD1i+1 implies that Type2 event did not occur for G1 in the current i+ 1st round.

Case 3: s-query, Old, Type Q5-1,|M|= 0.

Implication. Note that Type Q5 query must be the ﬁnal π-query of some long query M. The M can be present in Tπ as two diﬀerent branches. In the present case, all the inter- mediate queries of the branch – that represents M – are Q0. Now, note that, if M= 0, then this case is not possible. The other scenario has been considered in Case 2.

Case 4: s-query, Old, Type Q3,|M|= 0.

Implication. The event GOOD1i+1 implies that Type2 event did not occur for G1 in the current i+ 1st round; therefore, since |M| = 0, O1_i₊₁ follows the uniform distribution

U[0,22n−1]. As the graphsTs are isomorphic in both systems G1 and G2 by Lemma C.2,

|M|= 0 for G2. This implies thatO_i2₊₁follows the uniform distributionU[0,22n−1], since thes-query is fresh for G2.

Case 5: s-query, |M|>1.

Implication. |M| > 1 implies node collision in Ts which is impossible since GOOD1i+1 ensures that Type1 event did not occur for G1 in the previous i rounds forbidding the occurrence of node collision inTs.

Case 6: s-query, Fresh, |M|= 1.

Query-type? |M_|=? 0 ₁ Impossible (Not Type1) >1

s-query Long query

Final message- block? Empty Outputs no yes In Tπ? yes no In Ts? Random Outputs (Not Type0,Not Type1) yes no Identical Outputs Random Outputs (Not Type0) 1. Inputs I1 =I2 2. Isomorph.of Ts’s 3. V1 =V2 Fresh? Identically Distributed Outputs (Not Type2) yes no Query type? Random Outputs (Not Type2) Q5-1 Q3 Impossible (Not Type2) Random Outputs (Not Type0) Fresh? Identical Distribution (Not Type2) yes no Query type? Random Outputs (Not Type2) Q5-1 Q3 Q1,Q2, Q4,Q5-2 Impossible (Not Type2) type ofred branch? I II III Impossible

(Not Type3) Random Outputs (Not Type3) (Not Type3)Impossible 1 2 3 4 5 6 7 ₈ 9 12 13 14 15 16 17 s-1_-query Fresh? yes no Impossible (Not Type4) Random Outputs (Not Type0) Q1,Q2, Q4,Q5-2 10 11

Figure 14: The decision tree for the proof of Proposition C.1. The conditions for the system G1 are shown inside the diamonds of the decision tree. The leaf-node shows the implications of the conditions to the outputs of systems G1 and G2.

for G1,M ∈ Mimplies thatM /∈Dom(Dl) in the ﬁrstirounds, since the currents-query Ii+1 is fresh. Also note thatV_i1 =V_i2 and the isomorphism ofTs’s together imply thatDl in both systems are identical. Therefore, for G2 too, M /∈Dom(Dl) in the ﬁrst irounds. This implies thatO2_i₊₁ follows the uniform distribution U[0,22n−1].

Case 7: s-query, Old, Type Q1, Q2, Q4, or Q5-2,|M|= 1.

Implication. This case is impossible since GOOD1i+1 implies that Type2 event did not occur for G1 in the current i+ 1st round.

Case 8: s-query, Old, Type Q5-1,|M|= 1.

Implication.The event GOOD1i+1 implies that Type2 event did not occur in the i+ 1st round of G1; therefore, O1_i₊₁[0, n−1] follows the uniform distribution U[0,2n−1], and O_i1₊₁[n,2n −1] is a ﬁxed value. Now, for G1, M ∈ M implies that M ∈ Dom(Dl) after the ﬁrst i rounds, since the current s-query Ii+1 is of type Q5-1; also note that O_i1₊₁[n,2n−1] =Dl[M]. As in the previous case, Vi1 =Vi2 and the isomorphism of Ts’s together imply thatDl in both systems are identical. Therefore,O2i+1[n,2n−1] =Dl[M]; also note that O_i2₊₁[0, n−1] follows the uniform distribution U[0,2n−1]. In conclusion, O_i1₊₁ and O_i2₊₁ are identically distributed.

Case 9: s-query, Old, Type Q3,|M|= 1.

Implication. The event GOOD1i+1 implies that Type2 event did not occur for G1 in the currenti+ 1st round; therefore, since |M|= 1, O1

i+1 follows the uniform distribution

U[0,22n−1]. Since the graphsTsare isomorphic in both systems G1 and G2 by Lemma C.2, this implies thatO2_i₊₁ is the output of a freshs-query, and therefore it follows the uniform distributionU[0,22n₋_1].

Case 10: s−1-query and Fresh.

Implication. The condition implies thatOi₁+1follows the uniform distributionU[0,22n₋_1], since Type0 event did not occur in the current i+ 1st round. Because V₁i = V₂i+1, we have that the s−1-query is also a fresh query for G2. Therefore, Oi₂+1 follows the uniform

In document A Novel Permutation-based Hash Mode of Operation FP and the Hash Function SAMOSA (Page 39-49)