A good (keyed) hash function will appear to be a pseudorandom function (family). For each key, the hash function will appear to be an independent random function.
The theoretical foundations for the MD6 compression function presented in Section 6.1 support the hypothesis that MD6 is a pseudorandom function family. That section argues that the MD6 compression function is indistinguishable from a random oracle (under the assumptions given in Section 6.1.2). If you declare part of the input of a random oracle to be “key” and the remainder to be “input” (as MD6 does), you end up with a pseudorandom function family.
Conversely, any adversary that could distinguish MD6 (as keyed) from a pseudorandom function family could be correctly interpreted as an adversary for distinguishing the MD6 compression function $f_Q$ from a random oracle.
We know of no effective attacks for distinguishing the (keyed) MD6 compression function from a pseudorandom function family with the same key, input, and output sizes.
The following subsections discuss our attempts to distinguish MD6 from a pseudorandom function family by various statistical tests.
6.6.1 Standard statistical tests
A good hash function will pass any standard statistical test for randomness, such as those in the NIST Statistical Test Suite. This section reports our efforts to distinguish the MD6 compression function from a random oracle using two statistical
CHAPTER 6. COMPRESSION FUNCTION SECURITY 95
test suites: the NIST Statistical Test Suite, and TestU01.
Both of these statistical test suites are designed to test pseudorandom number generators. We created a pseudorandom number generator from MD6 by running MD6 in counter mode. Let $h_{r,d}$ denote the MD6 hash function parameterized with digest size $d$ and $r$ rounds, so that $h_{r,d}(m)$ is the $d$-bit output of the parameterized MD6 function on input $m$. Given a 64-bit seed $s$, our PRNG generates the sequence $h_{r,d}(s) \| h_{r,d}(s+1) \| \cdots$. We report the results of the chosen statistical tests for $d = 512$, $s = 0$ and various choices of $r$.
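The counter-mode construction can be sketched as follows. This is a minimal illustration, not the code used for the experiments: Python's standard library has no MD6, so SHA-512 stands in for $h_{r,d}$ with $d = 512$, and the 8-byte big-endian encoding of the counter is an assumption, since the text does not specify how $s$ is serialized.

```python
import hashlib

def stand_in_hash(message: bytes) -> bytes:
    """Placeholder for h_{r,d}: SHA-512 gives a 512-bit digest (this is NOT MD6)."""
    return hashlib.sha512(message).digest()

def counter_mode_stream(seed: int, num_bytes: int, hash_fn=stand_in_hash) -> bytes:
    """Return the first num_bytes of h(s) || h(s+1) || ... for a 64-bit seed s."""
    out = bytearray()
    counter = seed
    while len(out) < num_bytes:
        # Assumed encoding: the counter as 8 big-endian bytes, wrapping mod 2^64.
        block = (counter % 2**64).to_bytes(8, "big")
        out += hash_fn(block)
        counter += 1
    return bytes(out[:num_bytes])

# One sequence of 1 million bits (125000 bytes), starting from seed s = 0.
stream = counter_mode_stream(seed=0, num_bytes=125_000)
```

Passing a different `hash_fn` (for example a wrapper around a reduced-round MD6 implementation) would reuse the same loop for each value of $r$.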
These statistical tests fail to detect any nonrandomness in MD6 when MD6 has 11 or more rounds.
6.6.1.1 NIST Statistical Test Suite
The NIST Statistical Test Suite is available from http://csrc.nist.gov/groups/ST/toolkit/rng/index.html. The test suite contains implementations of 15 different randomness tests.
We ran the NIST statistical tests on MD6 for all r ∈ [0, 35]. For each r, we generated 1000 sequences of 1 million bits. Every test from the NIST Test Suite was run against every bit sequence, generating a list of p-values for every test. To determine if a test was successful at distinguishing MD6 from a random number generator, we tested the output p-values for uniformity using a Kolmogorov-Smirnov test. We also compared the actual number of statistically significant p-values to the expected number. Running the NIST suite for all r ∈ [0, 35] took approximately 10 days.
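The two checks on the p-values can be sketched as below. This is an illustration under stated assumptions: the text does not say how the Kolmogorov-Smirnov test was evaluated, so the sketch compares the KS distance against the standard asymptotic critical value $\sqrt{-\frac{1}{2}\ln(\alpha/2)}/\sqrt{n}$, and the function names are hypothetical.

```python
import math

def ks_uniform_statistic(p_values):
    """Kolmogorov-Smirnov distance between the sample and Uniform(0, 1)."""
    xs = sorted(p_values)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

def ks_critical_value(n, alpha=0.01):
    """Asymptotic critical value for the one-sample KS test (assumed criterion)."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) / math.sqrt(n)

def count_significant(p_values, alpha=0.01):
    """Observed and expected number of p-values below alpha."""
    observed = sum(1 for p in p_values if p < alpha)
    return observed, alpha * len(p_values)

# Synthetic data: evenly spread p-values look uniform; squaring them piles
# the values up near 0, which both checks should flag.
uniform_like = [(i + 0.5) / 1000 for i in range(1000)]
skewed = [x * x for x in uniform_like]
```

A test would count as distinguishing the generator from random when `ks_uniform_statistic(p_values)` exceeds `ks_critical_value(len(p_values))`, or when the observed count of small p-values is far from the expected count.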
To our surprise, a few of the NIST tests found MD6 non-random beyond 25 rounds. We believe, however, that these tests are incorrectly implemented. We tested this hypothesis by running the NIST tests on SHA-1 (with the full number of rounds), and found that several of the NIST tests also deem SHA-1 non-random.
Table 6.1 lists the maximum number of rounds of MD6 that each NIST test is able to distinguish from a random oracle. We have included only the results of the tests we believe to be correctly implemented, i.e., the tests passed by SHA-1.
SHA-1 failed the FFT, OverlappingTemplate, RandomExcursionsVariant and Serial tests, and we have omitted the results for MD6 on these tests. We replicated these four tests with the TestU01 suite, however, and found MD6 passes these tests for r ≥ 9. The following section describes our experiments with TestU01 in more detail.
6.6.1.2 TestU01
We also ran MD6 through the TestU01 suite of tests for random number generators. TestU01 was developed by Pierre L’Ecuyer and Richard Simard and is available from http://www.iro.umontreal.ca/~simardr/testu01/tu01.html. Further details on TestU01 are available in [38].
Test Name                      $r_0$
Runs                           8
Frequency                      8
Maurer’s Universal Statistic   7
Rank                           7
Longest Run                    7
Block Frequency                8
Approximate Entropy            8
Non-overlapping Template       8
Linear Complexity              4
Cumulative Sums                9
Random Excursions              6

Table 6.1: The maximum number $r_0$ of rounds of MD6 that each of the NIST tests can distinguish from random.
TestU01 contains a wide variety of statistical tests organized into several test batteries. We selected three of the provided batteries, SmallCrush, Crush and BigCrush, to test MD6. We also created a test battery to mimic the failing NIST tests by following recommendations in the TestU01 documentation. We ran these tests on MD6 for 0 < r ≤ 20, to determine how many rounds of MD6 these tests can distinguish from random.
The SmallCrush test battery runs in a few minutes, and all of the tests in the SmallCrush test battery pass for r ≥ 9. The Crush battery requires approximately 8 hours per round tested. All of the tests in the Crush test battery pass for r ≥ 11. The only test that fails for r = 10 is the LongestHeadRun test. The BigCrush test takes approximately 4 days to run, so we only ran it for r = 10 and r = 11. MD6 passes the BigCrush battery for both r = 10 and r = 11.
Our NIST test battery contains the following three tests:
• sspectral_Fourier1,
• smultin_MultinomialBitsOver, and
• swalk_RandomWalk1.
These tests are similar to the four NIST tests SHA-1 failed. We ran the smultin_MultinomialBitsOver and swalk_RandomWalk1 tests on 1000 bit strings of 1000000 bits each, that is, under the same conditions as the NIST tests. MD6 passes these tests for r ≥ 9. We ran the TestU01 sspectral_Fourier1 test on 46 bit strings of 1000000 bits each, as the test is only valid for a small number of input bit strings. Under these conditions, MD6 passes the sspectral_Fourier1 test for r ≥ 9. All three of these tests run in a few hours.
Standard statistical tests for pseudorandom number generators cannot distinguish MD6 from a random oracle when MD6 has 11 or more rounds.
6.6.2 Other statistical tests
This section reports our efforts to distinguish the MD6 compression function from a random oracle using other statistical tests we have devised or adapted from the literature.
We created several influence tests to measure correlations between input and output bit positions. Given an input bit position $b$ and an output bit position $c$, our tests measure $p_{bc}$, the probability that flipping bit $b$ causes bit $c$ to flip. For a random oracle, we expect $p_{bc} = \frac{1}{2}$ for any choice of $b$ and $c$.
Our tests measure $p_{bc}$ by using the following procedure:
1. Choose an input bit position $b$, an output bit position $c$, a number of rounds $r$, and a number of trials $n$.
2. Use RC4 to generate $n$ random 89-word inputs $x_1, x_2, \ldots, x_n$ to the MD6 compression function $f$.
3. For each input $x_i$, generate a corresponding input $x'_i$ by flipping the $b$th bit of $x_i$.
4. Compare the outputs of the $r$-round compression function $f_r(x'_i)$ and $f_r(x_i)$. Compute the output difference $o = f_r(x'_i) \oplus f_r(x_i)$.
5. Count $c_{bc}$, the number of times $o$ has a 1 in output position $c$. Compute $p_{bc} = c_{bc}/n$.
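The five steps above can be sketched in a few lines. This is a hedged illustration rather than the authors' test harness: SHA-512 stands in for the reduced-round compression function $f_r$, Python's `random.Random` stands in for RC4, and the bit numbering (bit 0 is the most significant bit of the first byte) is an assumption.

```python
import hashlib
import random

WORDS, WORD_BITS = 89, 64            # input size from the text: 89 words of 64 bits
IN_BYTES = WORDS * WORD_BITS // 8    # 712 bytes

def stand_in_f(x: bytes) -> bytes:
    """Placeholder for the r-round compression function f_r (SHA-512, NOT MD6)."""
    return hashlib.sha512(x).digest()

def flip_bit(x: bytes, b: int) -> bytes:
    """Flip input bit b (assumed numbering: bit 0 is the MSB of byte 0)."""
    y = bytearray(x)
    y[b // 8] ^= 0x80 >> (b % 8)
    return bytes(y)

def influence(b: int, n: int, f=stand_in_f, seed=1):
    """Estimate p_bc for one input bit b and every output bit c at once."""
    rng = random.Random(seed)        # stand-in for the RC4 generator in the text
    out_bits = len(f(bytes(IN_BYTES))) * 8
    counts = [0] * out_bits          # counts[c] plays the role of c_bc
    for _ in range(n):
        x = rng.getrandbits(IN_BYTES * 8).to_bytes(IN_BYTES, "big")
        diff = int.from_bytes(f(x), "big") ^ int.from_bytes(f(flip_bit(x, b)), "big")
        for c in range(out_bits):
            # Output bit c, counted from the most significant end.
            counts[c] += (diff >> (out_bits - 1 - c)) & 1
    return [count / n for count in counts]

p = influence(b=3648, n=2000)
```

With a well-mixed function every entry of `p` should sit close to 1/2; a reduced-round function with poor diffusion would show entries near 0 for output bits that the flipped input bit has not yet reached.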
For a given input bit $b$, our influence test simultaneously computes $p_{bc}$ for all output bits $c$. To test if MD6 behaves significantly differently from a random oracle, we compared the distribution of the measured $p_{bc}$ values with the expected distribution from a random oracle. In a random oracle, the $p_{bc}$ values are drawn from a binomial distribution with $p = \frac{1}{2}$ and $n$ trials. For large $n$, this distribution is essentially normal with mean $\frac{1}{2}$ and standard deviation $\frac{1}{2\sqrt{n}}$. We used an Anderson-Darling test and a chi-square test to measure the probability of our $p_{bc}$ values being drawn from this theoretical distribution.
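As a short worked check of the stated standard deviation (a sketch, assuming each trial flips output bit $c$ independently with probability $\frac{1}{2}$): the count of flips is binomial, and dividing by $n$ gives the spread of the measured probability.

```latex
\operatorname{Var}(c_{bc}) = n \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \frac{n}{4},
\qquad
\sigma(p_{bc}) = \frac{1}{n}\sqrt{\frac{n}{4}} = \frac{1}{2\sqrt{n}},
\qquad
n = 10^{6} \;\Rightarrow\; \sigma(p_{bc}) = 5 \times 10^{-4}.
```

The value $5 \times 10^{-4}$ agrees with the standard deviations of roughly 0.0005 reported in Table 6.2 for $r \ge 11$.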
Table 6.2 reports the results of running the above test on input bit 3648 with 1000000 trials. We chose this bit because it is one of the worst-case inputs; it is the first bit of the 57th input word, which is the last word to be incorporated into the hash function computation. Therefore, we expect the influence of bits in the 57th word to be more detectable than the influence of bits in any other word.
An Anderson-Darling $A^{*2}$ score above 1.035 is significant at the $P = 0.01$ level. Our $\chi^2$ test results in a Z-score, for which a score above 2.57 is significant at the $P = 0.01$ level. These results show that the basic influence test cannot differentiate between MD6 and a random oracle beyond 10 rounds. Running this test required only a few minutes.
 r   Mean      Standard Deviation   Anderson-Darling $A^{*2}$   $\chi^2$ Z-Score
 1   0.00000   0.000000             DNE                         22627394.37055
 2   0.00098   0.022112             DNE                         22583200.31969
 3   0.00880   0.070245             DNE                         22284642.00478
 4   0.04094   0.120308             269.07229                   20383323.51708
 5   0.10784   0.165396             124.15361                   16395698.58622
 6   0.19551   0.187628             42.07427                    11577960.41165
 7   0.32541   0.179635             29.50064                    5679459.06660
 8   0.43923   0.110198             123.03832                   1433321.38957
 9   0.49252   0.028513             262.46513                   78630.41306
10   0.49984   0.001440             138.43952                   167.39026
11   0.50001   0.000515             0.29285                     1.43492
12   0.50002   0.000495             0.84664                     -0.40540
13   0.50001   0.000506             0.47437                     0.59054
14   0.50000   0.000509             0.30570                     0.79043
15   0.50001   0.000494             0.25221                     -0.54828
16   0.49998   0.000488             0.31075                     -1.01786
17   0.49998   0.000503             0.63886                     0.33177
18   0.49998   0.000507             0.51248                     0.64595
19   0.50001   0.000497             0.22655                     -0.29475

Table 6.2: Test statistics for the influence test on reduced-round MD6 compression functions. $A^{*2} > 1.035$ or a $\chi^2$ Z-score $> 2.57$ is significant at the $P = 0.01$ level.
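The text does not spell out how the $\chi^2$ Z-score is computed. One reading that is consistent with the $r = 1$ row of Table 6.2 is to sum the squared standardized deviations of the $p_{bc}$ values and then normalize by the mean $k$ and variance $2k$ of a chi-square variable with $k$ degrees of freedom. The sketch below is that assumed reconstruction, not a confirmed description of the authors' code; it also assumes $k = 1024$ output bits (16 words of 64 bits).

```python
import math

def chi_square_z(p_values, n):
    """Normalized chi-square score for measured p_bc values (assumed form)."""
    sigma = 1 / (2 * math.sqrt(n))            # SD of p_bc under a random oracle
    chi2 = sum(((p - 0.5) / sigma) ** 2 for p in p_values)
    k = len(p_values)                         # degrees of freedom
    return (chi2 - k) / math.sqrt(2 * k)      # chi-square has mean k, variance 2k

# r = 1 row of Table 6.2: mean 0 and standard deviation 0, so every p_bc is 0.
# Assumption: k = 1024 output bits.
z_r1 = chi_square_z([0.0] * 1024, n=1_000_000)
print(round(z_r1, 5))  # approximately 22627394.37055, the r = 1 table entry
```

Under this reading, values of $p_{bc}$ that match the random-oracle distribution give a score near 0, which is what the rows for $r \ge 11$ show.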
We ran another influence test called the dibit influence test. This test is similar to the previous influence test, except it operates on pairs of adjacent bits, called dibits. A dibit can take on 4 values: 00, 01, 10, 11. The intuition behind this test is that adjacent bits tend to stay together through the MD6 compression function, so adjacent input bits may have undue influence on adjacent output bits.
For this test, we treated the input to the compression function as an array of 89 words of 32 dibits. We followed the same procedure as the previous influence test, except we created 3 inputs from each random input $x_i$ (since there are 3 possible ways an input can differ in one dibit position). In the final step, we counted each of the 4 possible output difference patterns independently, resulting in 4 output counts, $c_{00}$, $c_{01}$, $c_{10}$ and $c_{11}$. For large $n$, the $c$ values of a random oracle are normally distributed with mean $\frac{n}{4}$ and standard deviation $\sqrt{\frac{3n}{16}}$. We again used an Anderson-Darling test and a $\chi^2$ test to compare the measured $c$ values to the expected distribution.
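The mean and standard deviation quoted for the dibit counts follow from a binomial model. As a sketch, assume that for a random oracle each of the 4 output difference patterns $ij \in \{00, 01, 10, 11\}$ at a given output dibit position occurs with probability $\frac{1}{4}$ in each of the $n$ trials, independently:

```latex
c_{ij} \sim \mathrm{Binomial}\!\left(n, \tfrac{1}{4}\right), \qquad
\mathbb{E}[c_{ij}] = \frac{n}{4}, \qquad
\operatorname{Var}(c_{ij}) = n \cdot \tfrac{1}{4} \cdot \tfrac{3}{4} = \frac{3n}{16}, \qquad
\sigma(c_{ij}) = \sqrt{\frac{3n}{16}}.
```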
Table 6.3 reports the results for the first dibit of the 57th word, chosen for the same reason as the first bit of the 57th word in the standard influence test. The following table shows the most and least significant dibit pairs for each of the included rounds.
An Anderson-Darling $A^{*2}$ score above 1.035 is significant at the $P = 0.01$ level. Our $\chi^2$ test results in a Z-score, for which a score above 2.57 is significant at the $P = 0.01$ level. Our results show that the dibit influence test cannot differentiate between MD6 and a random oracle beyond approximately 10 rounds. Although there are some statistically significant results beyond 10 rounds, around 2.4 statistically significant scores are expected by chance, since there are a total of 240 measurements for each statistic. Running this test took only a few minutes.
Our work with the influence test and its variants shows that these tests cannot distinguish MD6 from a random oracle beyond 10 rounds.
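The figure of about 2.4 expected significant scores is simple arithmetic on the numbers in the text. The sketch below reproduces it and, under the added assumption that the 240 measurements are independent, also estimates how likely it is to see at least one significant score purely by chance.

```python
measurements = 240   # per statistic, from the text
alpha = 0.01         # significance level used throughout this section

# Expected number of scores that cross the threshold by chance alone.
expected_false_positives = measurements * alpha

# Assumption: the 240 measurements are independent of one another.
p_at_least_one = 1 - (1 - alpha) ** measurements
```

Here `expected_false_positives` is 2.4 and `p_at_least_one` is roughly 0.91, so a handful of significant scores beyond 10 rounds is what a random oracle would be expected to produce as well.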