Where can you get random numbers if you don’t have an RNG? It turns out there are algorithms called pseudo-random number generators (PRNGs). Just as there are algorithms that convert plaintext into ciphertext, there are algorithms that produce what are called “pseudo-random” numbers.
If you use one of these algorithms to generate a few thousand numbers and apply the statistical tests, the numbers pass. What makes these numbers pseudo-random and not random is that they are repeatable. If you install the same PRNG on another computer, you get the same results. If you run the program two weeks later, you get the same results.
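To make the repeatability concrete, here is a minimal sketch. It assumes a toy linear congruential generator, which is not an algorithm this chapter names; the function name, the starting state, and the constants are chosen only for illustration, and a generator this simple is not suitable for cryptography.

```python
def toy_prng(count, state=12345):
    """Return `count` numbers from a toy linear congruential generator.

    Illustration only: the name, starting state, and constants are
    assumptions, not anything defined in the text.
    """
    numbers = []
    for _ in range(count):
        # Each new state depends only on the previous state, so the whole
        # sequence is fixed once the starting state is fixed.
        state = (1664525 * state + 1013904223) % 2**32
        numbers.append(state)
    return numbers


first_run = toy_prng(5)
second_run = toy_prng(5)          # think "another computer" or "two weeks later"
print(first_run == second_run)    # prints True
```

Nothing in the algorithm changes between runs, so nothing in the output changes either.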
This is one reason we say that numbers that pass statistical tests of randomness are “probably” random. Even if they pass, do we know whether they are repeatable? The math tests give us only part of the answer.
If the numbers are repeatable, what good is a PRNG? The answer is that you can change the output by using what is known as a seed. Just as RNGs take input (radioactive decay, atmospheric conditions, electrical variances), a PRNG takes input (the seed). If you change the input, you change the output. With RNGs, the input is constantly changing on its own, unpredictably. With a PRNG, it’s up to you to make sure the input changes each time you want to generate new numbers.
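The effect of the seed can be sketched in a few lines. This example assumes Python's standard `random` module as a stand-in PRNG (it is a statistical generator, not a cryptographic one), and the helper name `generate` is made up for the illustration.

```python
import random


def generate(seed, count=5):
    """Return `count` pseudo-random byte values produced from `seed`."""
    prng = random.Random(seed)      # a PRNG instance with its own internal state
    return [prng.randrange(256) for _ in range(count)]


same_a = generate(seed=1)
same_b = generate(seed=1)
different = generate(seed=2)

print(same_a == same_b)       # prints True: same input, same output
print(same_a == different)    # a different seed should give a different sequence
```

The algorithm is identical in all three calls. Only the input changed.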
What is this seed? In the real world, a seed can be lots of things: the time of day down to the millisecond, various constantly changing computer state measurements, user input, and other values. Maybe you’ve seen a user-input seed collector. An application may ask you to move the mouse around. At selected intervals, the program looks at where, on the screen, the arrow is located. This value is a pair of numbers: how many pixels up from the bottom of the screen and how many pixels over from the left. Any one input is not sufficient, but if you put them all together you have unpredictability (see Figure 2-6).
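A seed collector along these lines might look like the following sketch. Everything here is hypothetical: the function name `collect_seed`, the choice of sources, and the hard-coded mouse samples (a real collector would read them from the windowing system at intervals).

```python
import os
import time


def collect_seed(mouse_positions):
    """Combine several changing values into one block of seed bytes.

    `mouse_positions` is a list of (x, y) pixel pairs; how they are
    sampled is outside this sketch.
    """
    pieces = [
        time.time_ns().to_bytes(8, "big"),   # time of day, finer than a millisecond
        os.getpid().to_bytes(4, "big"),      # one small piece of computer state
    ]
    for x, y in mouse_positions:
        pieces.append(x.to_bytes(2, "big"))  # pixels over from the left
        pieces.append(y.to_bytes(2, "big"))  # pixels up from the bottom
    return b"".join(pieces)


samples = [(412, 87), (430, 95), (455, 140), (501, 212)]   # made-up mouse samples
seed = collect_seed(samples)
print(len(seed) * 8, "bits of seed collected")
```

No single piece is unpredictable enough by itself. The point of the sketch is only that the collector concatenates many small, changing measurements into one seed.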
You may be thinking, “Why use a PRNG to generate the numbers? Why not just use the seed?” There are two main reasons. The first reason is the need for speed. Seed collection is often time-consuming. Suppose you need only a few thousand bits of random data. A seed collector may take several minutes to gather the necessary numbers. When was the last time you waited several minutes for a program to do something without getting frustrated? To save time, you can gather 160 or so bits of seed (which may take little time), feed it to the PRNG, and get the required thousands of bits in a few milliseconds.
The second reason to use a PRNG is entropy, a term that describes chaos. The greater the entropy, the greater the chaos. To put it another way, the more entropy, the more random the output. Suppose you want 128 bits of entropy. A seed may have that, but it is spread over 2,400 bits. For example, the time of day down to the millisecond is represented in 64 bits. But the year, the month, the date, and maybe even the hour and minute might be easy to guess. The millisecond (two or three bits of the time of day) is where the entropy is. This means that out of 64 bits of seed, you have 2 bits of entropy. Similarly, your other seed data may suffer the same condition. A PRNG will take those 2,400 bits of seed and compress them to 128 bits.
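The compression step can be sketched with a message digest. This is an assumption-laden illustration: the text does not say which digest a PRNG uses, so SHA-256 from Python's `hashlib` stands in, and the function name and the filler seed bytes are invented for the example.

```python
import hashlib


def compress_seed(seed_bytes, out_bits=128):
    """Squeeze a long, low-density seed down to `out_bits` bits with a digest."""
    digest = hashlib.sha256(seed_bytes).digest()   # 256-bit digest of the whole seed
    return digest[: out_bits // 8]                 # keep only the first out_bits bits


# Stand-in for 2,400 bits (300 bytes) of collected seed material.
raw_seed = bytes(range(256)) + bytes(44)
compact = compress_seed(raw_seed)
print(len(raw_seed) * 8, "->", len(compact) * 8)   # prints 2400 -> 128
```

Note that compressing cannot create entropy. If the 2,400 bits hold less than 128 bits of entropy, so does the 128-bit result.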
Well, then, why not take the seed and throw away the low-entropy bits? In a sense, that’s what a PRNG does. You can do it, or you can have a PRNG do it, and the latter means less work for you.
A random number generator (left) collects unpredictable information and converts it into random numbers. A pseudo-random number generator (right) collects seed information and converts it into numbers that pass statistical tests of randomness but can be repeated.
By the way, most PRNGs use message digests to do the bulk of the work. We talk about the details of digests in Chapter 5, but for now, let’s just say that they are the “blenders” of cryptography. Just as a blender takes recognizable food and purees it into a random, unrecognizable blob, a message digest takes recognizable bits and bytes and mixes them up into a random, unrecognizable blob. That sounds like what we look for in a PRNG.
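As a rough picture of a digest doing the bulk of the work, here is a minimal sketch of a digest-based generator. It is not a vetted design and not the construction of any particular product: the class name `DigestPRNG`, the use of SHA-256, and the seed-plus-counter layout are all assumptions made for illustration.

```python
import hashlib


class DigestPRNG:
    """Toy PRNG that stretches a seed by hashing it with a running counter."""

    def __init__(self, seed=b""):
        # Blend the seed into a fixed-size internal state.
        self.state = hashlib.sha256(seed).digest()
        self.counter = 0

    def next_bytes(self, count):
        """Return `count` pseudo-random bytes."""
        output = b""
        while len(output) < count:
            # Each block is the digest of the state plus an ever-changing counter.
            block = hashlib.sha256(self.state + self.counter.to_bytes(8, "big")).digest()
            output += block
            self.counter += 1
        return output[:count]


prng = DigestPRNG(seed=b"160 or so bits of seed")
data = prng.next_bytes(500)     # 4,000 bits stretched from a short seed
print(len(data) * 8)            # prints 4000
```

A short seed goes in once, and the digest is then run as many times as needed to produce the requested output, which is why generation takes milliseconds rather than minutes.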
A good PRNG always produces pseudo-random numbers, regardless of the seed. Do you have a “good” seed (one with lots of entropy)? The PRNG will produce numbers that pass tests of randomness. Do you have a “bad” seed (or no seed at all)? The PRNG will still produce good numbers that pass the tests.
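One simple way to see this is to count bits. The sketch below assumes a toy digest-and-counter generator built on SHA-256 (an illustration, not a named algorithm from the text) and applies only the crudest statistical check, the share of 1 bits, to output produced from no seed at all and from a one-byte seed. Real test suites apply many more checks than this.

```python
import hashlib


def digest_stream(seed, count):
    """Return `count` bytes made by hashing `seed` together with a counter."""
    out = b""
    counter = 0
    while len(out) < count:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:count]


def fraction_of_ones(data):
    """Monobit check: the share of 1 bits, which should be close to 0.5."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (len(data) * 8)


for label, seed in [("no seed", b""), ("weak seed", b"\x00")]:
    share = fraction_of_ones(digest_stream(seed, 10_000))
    print(label, round(share, 3))
```

Both shares should land very near 0.5, even though anyone could reproduce either stream exactly. Passing the test says nothing about whether the output can be predicted.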
Then why do you need a good seed? The answer is given in the next section.