Computing with Numbers
3.4 Limitations of Computer Arithmetic
It’s sometimes suggested that “!” is used to represent factorial because the function grows very rapidly. For example, here is what happens if we use our program to find the factorial of 100:
Please enter a whole number: 100
The factorial of 100 is 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
That’s a pretty big number!
Although recent versions of Python have no difficulty with this calculation, older versions of Python (and modern versions of other languages such as C++ and Java) would not fare as well.
For example, here’s what happens in several runs of a similar program written using Java.
# run 1
Please enter a whole number: 6
The factorial is: 720
# run 2
Please enter a whole number: 12
The factorial is: 479001600
# run 3
Please enter a whole number: 13
The factorial is: 1932053504
This looks pretty good; we know that 6! = 720. A quick check also confirms that 12! = 479001600.
Unfortunately, it turns out that 13! = 6227020800. It appears that the Java program has given us an incorrect answer!
What is going on here? So far, I have talked about numeric data types as representations of familiar numbers such as integers and decimals (fractions). It is important to keep in mind, however, that computer representations of numbers (the actual data types) do not always behave exactly like the numbers that they stand for.
Remember back in Chapter 1 you learned that the computer’s CPU can perform very basic operations such as adding or multiplying two numbers? It would be more precise to say that the CPU can perform basic operations on the computer’s internal representation of numbers. The problem in this Java program is that it is representing whole numbers using the computer’s underlying int data type and relying on the computer’s multiplication operation for ints. Unfortunately, these machine ints are not exactly like mathematical integers. There are infinitely many integers, but only a finite range of ints. Inside the computer, ints are stored in a fixed-sized binary representation. To make sense of all this, we need to look at what’s going on at the hardware level.
Computer memory is composed of electrical “switches,” each of which can be in one of two possible states, basically on or off. Each switch represents a binary digit or bit of information. One bit can encode two possibilities, usually represented with the numerals 0 (for off) and 1 (for on). A sequence of bits can be used to represent more possibilities. With two bits, we can represent four things: 00, 01, 10, and 11.
Three bits allow us to represent eight different values by adding a zero or one to each of the four two-bit patterns.
You can see the pattern here. Each extra bit doubles the number of distinct patterns. In general, $n$ bits can represent $2^n$ different values.
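The doubling pattern is easy to check for yourself. Here is a minimal sketch that lists every pattern for a given number of bits; the function name bit_patterns is just an illustrative choice, not something from our earlier programs.

```python
from itertools import product

def bit_patterns(n):
    """Return every distinct pattern of n bits as a string of 0s and 1s."""
    return ["".join(bits) for bits in product("01", repeat=n)]

# Each extra bit doubles the number of patterns.
for n in (1, 2, 3):
    patterns = bit_patterns(n)
    print(n, "bits:", len(patterns), "patterns:", patterns)
```

Notice that the three-bit list is just the two-bit list written out twice, once with a leading 0 and once with a leading 1.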
The number of bits that a particular computer uses to represent an int depends on the design of the CPU. Typical PCs today use 32 or 64 bits. For a 32-bit CPU, that means there are $2^{32}$ possible values. These values are centered at 0 to represent a range of positive and negative integers. Now $\frac{2^{32}}{2} = 2^{31}$. So, the range of integers that can be represented in a 32-bit int value is $-2^{31}$ to $2^{31} - 1$. The reason for the $-1$ on the high end is to account for the representation of 0 in the top half of the range.
Given this knowledge, let’s try to make sense of what’s happening in the Java factorial example.
If the Java program is relying on a 32-bit int representation, what’s the largest number it can store?
Python can give us a quick answer.
>>> 2**31-1
2147483647
Notice that this value (about 2.1 billion) lies between 12! (about 479 million) and 13! (about 6.2 billion). That means the Java program is fine for calculating factorials up to 12, but after that the representation “overflows” and the results are garbage. Now you know exactly why the simple Java program can’t compute the factorial of 13. Of course, that leaves us with another puzzle: why does the modern Python program seem to work quite well computing with large integers?
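Before taking up that puzzle, we can imitate the overflow in Python to see where the Java result comes from. This is only a sketch: it assumes the 32-bit int simply keeps the low 32 bits of each product and treats the top half of the patterns as negative numbers (a scheme known as two's complement). The names wrap_int32 and factorial_int32 are made up for this illustration.

```python
INT_BITS = 32
INT_MAX = 2**(INT_BITS - 1) - 1   # largest 32-bit int
INT_MIN = -2**(INT_BITS - 1)      # smallest 32-bit int

def wrap_int32(value):
    """Reduce an exact integer to what a 32-bit int would hold (assumed wraparound)."""
    value &= 2**INT_BITS - 1      # keep only the low 32 bits
    if value > INT_MAX:           # patterns in the top half stand for negatives
        value -= 2**INT_BITS
    return value

def factorial_int32(n):
    """Compute n! the way a program limited to 32-bit ints would."""
    result = 1
    for factor in range(2, n + 1):
        result = wrap_int32(result * factor)
    return result

for n in (6, 12, 13):
    print(n, factorial_int32(n))
```

The three results are 720, 479001600, and 1932053504, matching the Java runs. The last one is just 6227020800 with $2^{32}$ subtracted from it.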
At first, you might think that Python uses the float data type to get us around the size limitation of the ints. However, it turns out that floats do not really solve this problem. Here is an example run of a modified factorial program that uses floating point numbers.
Please enter a whole number: 30
The factorial of 30 is 2.6525285981219103e+32
Although this program runs just fine, after switching to float, we no longer get an exact answer.
A very large (or very small) floating point value is printed out using exponential, or scientific, notation. The e+32 at the end means that the result is equal to $2.6525285981219103 \times 10^{32}$. You can think of the +32 at the end as a marker that shows where the decimal point should be placed.
In this case, it must move 32 places to the right to get the actual value. However, there are only 16 digits to the right of the decimal, so we have “lost” the last 16 digits.
Remember, floats are approximations. Using a float allows us to represent a much larger range of values than a 32-bit int, but the amount of precision is still fixed. In fact, a computer stores floating point numbers as a pair of fixed-length (binary) integers. One integer represents the string of digits in the value, and the second represents the exponent value that keeps track of where the whole part ends and the fractional part begins.
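You can watch the digits disappear by comparing an exact int with its float version. The sketch below assumes Python floats are the usual 64-bit doubles found on typical machines. It uses math.factorial from the standard library rather than our own program, so the last printed digit of the float may differ slightly from the run above.

```python
import math

exact = math.factorial(30)      # Python int: every digit is kept
approx = float(exact)           # float: only about 16 or 17 significant digits

print(exact)
print(approx)
print(int(approx) == exact)     # False: the low-order digits were rounded away

# A float is stored as a fixed-size digit part and an exponent.
# math.frexp splits it so that approx == mantissa * 2**exponent.
mantissa, exponent = math.frexp(approx)
print(mantissa, exponent)

# Past 2**53, a float can no longer tell neighboring whole numbers apart.
print(2.0**53 + 1 == 2.0**53)   # True
```

The range is enormous, but the number of digits that can be trusted stays the same no matter how large the value gets.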
Fortunately, Python has a better solution for large, exact values. A Python int is not a fixed size, but expands to accommodate whatever value it holds. The only limit is the amount of memory the computer has available to it. When the value is small, Python can just use the computer’s underlying int representation and operations. When the value gets larger, Python automatically converts to a representation using more bits. Of course, in order to perform operations on larger numbers, Python has to break down the operations into smaller units that the computer hardware is able to handle, much like the way you might do long division by hand. These operations will not be as efficient (they require more steps), but they allow our Python ints to grow to arbitrary size. And that’s what allows our simple factorial program to compute some whopping large results.
This is a very cool feature of Python.
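To get a feel for how large values can be broken into smaller units, here is a rough sketch that stores a big number as a list of small pieces and multiplies it one piece at a time, carrying as you would by hand. It is an illustration only, not how Python actually does it: the base of 10**4 and the names multiply_pieces, pieces_to_int, and factorial_pieces are all assumptions made for this example.

```python
import math

BASE = 10**4   # each piece holds four decimal digits (an illustrative choice)

def multiply_pieces(pieces, small):
    """Multiply a number stored as pieces (lowest piece first) by a small int."""
    result = []
    carry = 0
    for piece in pieces:
        # Each step involves only small numbers the hardware can handle.
        carry, digit = divmod(piece * small + carry, BASE)
        result.append(digit)
    while carry:
        carry, digit = divmod(carry, BASE)
        result.append(digit)
    return result

def pieces_to_int(pieces):
    """Reassemble the pieces into a single Python int for checking."""
    value = 0
    for piece in reversed(pieces):
        value = value * BASE + piece
    return value

def factorial_pieces(n):
    """Compute n! using only piece-by-piece multiplication."""
    pieces = [1]
    for factor in range(2, n + 1):
        pieces = multiply_pieces(pieces, factor)
    return pieces

big = factorial_pieces(100)
print(len(big), "pieces")
print(pieces_to_int(big) == math.factorial(100))
```

No single multiplication here ever involves a number larger than about a million, yet the list as a whole holds the exact 158-digit answer. The price is the extra steps: the longer the list, the more small multiplications each operation takes.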