How a Statistical Test Works - Testing Randomness

6.5 Testing Randomness

6.5.1 How a Statistical Test Works

A statistical test is formulated to test a specific null hypothesis (H0), which is always paired with an alternative hypothesis (Ha). When testing randomness, the null hypothesis is that the sequence being tested is random, and the alternative hypothesis is that the sequence is not random. The result of the applied statistical test is the acceptance or rejection of the null hypothesis. If the test accepts the null hypothesis, the conclusion is that the generator is producing random values, as far as the tested data shows. If the test rejects the null hypothesis, the conclusion is that the generator is not producing random values.

For each test, a relevant randomness statistic must be chosen and used to decide whether to accept or reject the null hypothesis. Under the assumption of randomness, the chosen statistic has a distribution of possible values, and a critical value is determined from this distribution (typically at its 99% point). During a test, the value of the test statistic is computed on the data and compared with the critical value. If it exceeds the critical value, the null hypothesis of randomness is rejected; otherwise the null hypothesis is not rejected, i.e., it is accepted. If the randomness assumption is in fact true for the data being evaluated, the computed statistic has a very low probability of exceeding the critical value, so the null hypothesis has a low probability of being rejected. From the point of view of statistical hypothesis testing, such a low-probability event should not occur naturally. Therefore, if the computed test statistic does exceed the critical value, the conclusion is that the original assumption of randomness is suspect or faulty.
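The comparison of a statistic with a critical value can be sketched in a few lines. The text does not name a particular statistic, so this sketch assumes the frequency (monobit) statistic, |#ones - #zeros| / sqrt(n), which is approximately half-normal for a random sequence. The function names are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def monobit_statistic(bits):
    """Return |#ones - #zeros| / sqrt(n) for a sequence of 0/1 values."""
    n = len(bits)
    s_n = sum(1 if b else -1 for b in bits)
    return abs(s_n) / sqrt(n)

def critical_value(alpha):
    """Point that |Z| exceeds with probability alpha, Z standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def reject_h0(bits, alpha=0.01):
    """True if the statistic exceeds the critical value (H0 is rejected)."""
    return monobit_statistic(bits) > critical_value(alpha)

balanced = [0, 1] * 500      # 500 ones and 500 zeros: statistic is 0
all_ones = [1] * 1000        # statistic is sqrt(1000), far in the tail

print(reject_h0(balanced))   # False
print(reject_h0(all_ones))   # True
```

The alternating sequence is accepted even though it is plainly patterned, which shows that a single statistic only detects the kind of non-randomness it measures.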

In conclusion, statistical hypothesis testing has two possible outcomes: accept H0 (concluding that the data is random) or reject H0 (concluding that the data is non-random).

However, the outcome of a statistical test does not necessarily reflect the true status of the data.

              H0 is true          H0 is false
Accept H0     Right conclusion    Type II error
Reject H0     Type I error        Right conclusion

Table 6.1: True status of the data under analysis versus the conclusion reached by the testing procedure. Note that the true status of the data is unknown in almost all cases, which is precisely why statistical tests are performed.

Table 6.1 illustrates the errors that can occur. There are two types: a Type I error, when H0 is true but the statistical test rejects it, and a Type II error, when Ha is true but the statistical test accepts H0 instead of rejecting it.

In the case studied here, where the null hypothesis is that the data is random, the correct conclusions are to accept H0 when the tested data really is random and to reject H0 when the tested data is non-random.

The probability of a Type I error is called the significance level of the test, denoted by α, and it can be set before the test is run. Thus, α is the probability that the test indicates that the sequence is not random when it really is random. In cryptography, α is commonly chosen in the range [0.001, 0.01].

The probability of a Type II error is denoted by β. Thus, β is the probability that the test indicates that the sequence is random when it is not. This can happen, for example, when a "bad" generator produces a sequence that appears to have random properties. Unlike α, β is not a fixed value: there are infinitely many ways in which a data stream can be non-random, and each yields a different probability that the sequence appears to have random properties. For the same reason, β is harder to calculate than α. However, α, β and the size n of the tested sequence are related in such a way that specifying two of them automatically determines the third. Usually n and α are specified, and a critical point for the given statistic is then selected so as to produce the smallest β.
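To make the dependence of β on the kind of non-randomness and on n concrete, the following sketch computes β for one specific alternative. It assumes (none of this is stated in the text) a generator that emits independent bits with a fixed probability p_one of being 1, the frequency statistic |#ones - #zeros| / sqrt(n), and a normal approximation; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def beta_biased_source(p_one, n, alpha=0.01):
    """Approximate P(accept H0 | bits are 1 with probability p_one).

    Assumes 0 < p_one < 1 and a normal approximation to the bit counts.
    """
    # Critical value fixed by alpha under H0 (two-sided standard normal).
    t = NormalDist().inv_cdf(1 - alpha / 2)
    # Under the biased alternative, (#ones - #zeros) / sqrt(n) is roughly
    # normal with this mean and standard deviation.
    mean = sqrt(n) * (2 * p_one - 1)
    std = 2 * sqrt(p_one * (1 - p_one))
    shifted = NormalDist(mean, std)
    # H0 is accepted when the statistic falls inside [-t, t].
    return shifted.cdf(t) - shifted.cdf(-t)

for p_one, n in [(0.51, 1000), (0.55, 1000), (0.55, 10000)]:
    print(p_one, n, round(beta_biased_source(p_one, n), 4))
```

With α held fixed, β shrinks as the bias grows or as n increases, and at p_one = 0.5 the expression reduces to 1 - α, the probability of correctly accepting a truly random sequence.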

One of the primary goals of a statistical test is to minimize the probability of a Type II error.

Each test is based on the calculation of a test statistic value. If the test statistic value is S and the critical value is t, then:

• Type I error probability: P(S > t | H0 is true) = P(reject H0 | H0 is true)

• Type II error probability: P(S ≤ t | H0 is false) = P(accept H0 | H0 is false)
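The two error probabilities in the list above can also be estimated by simulation. This is only a sketch under assumptions not made in the text: S is the frequency statistic |#ones - #zeros| / sqrt(n), t = 2.5758 is the critical value for α = 0.01, Python's seeded pseudo-random generator stands in for a random source, and the "H0 is false" case is a source biased to emit ones with probability 0.55.

```python
import random
from math import sqrt

T = 2.5758  # assumed critical value t for alpha = 0.01

def statistic(bits):
    """S = |#ones - #zeros| / sqrt(n) for a list of 0/1 integers."""
    n = len(bits)
    return abs(2 * sum(bits) - n) / sqrt(n)

def rejection_rate(p_one, n=1000, trials=1000, seed=1):
    """Fraction of simulated sequences for which S > T (H0 rejected)."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        bits = [1 if rng.random() < p_one else 0 for _ in range(n)]
        if statistic(bits) > T:
            rejected += 1
    return rejected / trials

type_i = rejection_rate(0.5)         # estimates P(S > t | H0 is true)
type_ii = 1 - rejection_rate(0.55)   # estimates P(S <= t | H0 is false)
print("estimated Type I error probability:", type_i)
print("estimated Type II error probability:", type_ii)
```

The first estimate should land close to the chosen α, while the second depends entirely on which non-random source is simulated.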

The test statistic is used to calculate a P-value. A P-value is the probability that a perfect random number generator would have produced a sequence less random than the sequence that was tested. Thus, the P-value summarizes the strength of the evidence against the null hypothesis. A P-value equal to 1 means the sequence appears to have perfect randomness, whereas a P-value equal to 0 means the sequence appears to be completely non-random. If P-value ≥ α, the null hypothesis is accepted. If P-value < α, the null hypothesis is rejected.
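As an illustration of the P-value decision rule, the sketch below assumes the frequency statistic s_obs = |#ones - #zeros| / sqrt(n), for which the P-value under a normal approximation is erfc(s_obs / sqrt(2)). The choice of statistic and the function names are assumptions made for the example, not part of the text.

```python
from math import erfc, sqrt

def monobit_p_value(bits):
    """P-value of the frequency statistic for a list of 0/1 integers."""
    n = len(bits)
    s_obs = abs(2 * sum(bits) - n) / sqrt(n)
    # Probability that a standard normal variable exceeds s_obs in magnitude.
    return erfc(s_obs / sqrt(2))

def decide(bits, alpha=0.01):
    """Return the P-value and the resulting decision about H0."""
    p = monobit_p_value(bits)
    return p, ("accept H0" if p >= alpha else "reject H0")

mild = [1] * 520 + [0] * 480     # 520 ones out of 1000
strong = [1] * 560 + [0] * 440   # 560 ones out of 1000

print(decide(mild))
print(decide(strong))
```

Only the counts of ones and zeros matter to this statistic, so the grouped sequences above are enough to exercise the rule: the mild imbalance gives a P-value above 0.01 and is accepted, while the strong one falls below 0.01 and is rejected.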

A significance level of α = 0.01 indicates that one would expect one sequence in 100 to be rejected even if the sequences were random. For a P-value ≥ 0.01, a sequence would be considered random with a confidence level of 99%. For a P-value < 0.01, a sequence would be considered non-random with a confidence level of 99%.
