Asymptotic notation has been in use (with some variations) since the late 19th century and is an essential tool in analyzing algorithms and data structures. The core idea is to represent the resource we’re analyzing (usually time but sometimes also memory) as a function, with the input size as its parameter. For example, we could have a program with a running time of T(n) = 2.4n + 7.
An important question arises immediately: what are the units here? It might seem trivial whether we measure the running time in seconds or milliseconds or whether we use bits or megabytes to represent problem size. The somewhat surprising answer, though, is that not only is it trivial, but it actually will not affect our results at all. We could measure time in Jovian years and problem size in kg (presumably the mass of the storage medium used), and it will not matter. This is because our original intention of ignoring implementation details carries over to these factors as well: the asymptotic notation ignores them all! (We do normally assume that the problem size is a positive integer, though.)
What we often end up doing is letting the running time be the number of times a certain basic operation is performed, while problem size is either the number of items handled (such as the number of integers to be sorted, for example) or, in some cases, the number of bits needed to encode the problem instance in some reasonable encoding.
Forgetting. Of course, the assert doesn’t work. (http://xkcd.com/379)
4
For an “out-of-the-box” solution for inserting objects at the beginning of a sequence, see the black box sidebar on
■Note Exactly how you encode your problems and solutions as bit patterns usually has little effect on the asymptotic running time, as long as you are reasonable. For example, avoid representing your numbers in the unary number system (1=1, 2=11, 3=111…).
The asymptotic notation consists of a bunch of operators, written as Greek letters. The most important ones (and the only ones we’ll be using) are O (originally an omicron but now usually called “Big Oh”), Ω (omega), and Θ (theta). The definition for the O operator can be used as a foundation for the other two. The expression O(g), for some function g(n), represents a set of functions, and a function f(n) is in this set if it satisfies the following condition: there exists a natural number n0 and a positive
constant c such that f(n) ≤cg(n)
for all n≥n0. In other words, if we’re allowed to tweak the constant c (for example, by running the
algorithms on machines of different speeds), the function g will eventually (that is, at n0) grow bigger
than f. See Figure 2-1 for an example.
This is a fairly straightforward and understandable definition, although it may seem a bit foreign at first. Basically, O(g) is the set of functions that do not grow faster than g. For example, the function n2 is
in the set O(n2), or, in set notation, n2∈O(n2). We often simply say that n2 is O(n2).
The fact that n2 does not grow faster than itself is not particularly interesting. More useful, perhaps,
is the fact that neither 2.4n2 + 7 nor the linear function n does. That is, we have both
2.4n2 + 7 ∈O(n2) and n∈O(n2).
cn
2T(n)
n
014
The first example shows us that we are now able to represent a function without all its bells and whistles; we can drop the 2.4 and the 7 and simply express the function as O(n2), which gives us just the
information we need. The second shows us that O can be used to express loose limits as well: any function that is better (that is, doesn’t grow faster) than g can be found in O(g).
How does this relate to our original example? Well, the thing is, even though we can’t be sure of the details (after all, they depend on both the Python version and the hardware you’re using), we can describe the operations asymptotically: the running time of appending n numbers to a Python list is O(n), while inserting n numbers at its beginning is O(n2).
The other two, Ω and Θ, are just variations of O. Ω is its complete opposite: a function f is in Ω(g) if it satisfies the following condition: there exists a natural number n0 and a positive constant c such that
f(n) ≥cg(n)
for all n≥n0. So, where O forms a so-called asymptotic upper bound, Ω forms an asymptotic lower
bound.
■Note Our first two asymptotic operators, O and Ω, are each others’ inverses: if f is O(g), then g is Ω(f ). Exercise
1–3 asks you to show this.
The sets formed by Θ are simply intersections of the other two, that is, Θ(g) = O(g) ∩Ω(g). In other words, a function f is in Θ(g) if it satisfies the following condition: there exists a natural number n0 and
two positive constants c1 and c2 such that
c1g(n) ≤f(n) ≤c2g(n)
for all n≥n0. This means that f and g have the same asymptotic growth. For example, 3n2 + 2 is Θ(n2), but
we could just as well write that n2 is Θ(3n2 + 2). By supplying an upper bound and a lower bound at the
same time, the Θ operator is the most informative of the three, and I will use it when possible.