Solving Problems
4.4 DECOMPOSITION: OTHER USES
Decomposition is certainly an essential skill for software design, but design is far from the only way that decomposition is used in computing. Modern computers often contain multiple processors, also referred to as cores. These multicore machines allow different programs, or different parts of the same program, to execute simultaneously. The idea that multiple pieces of software execute simultaneously, also known as multitasking, requires that the software be decomposed. This multitasking is restricted to a handful of cores in most laptop computers but extends to tens of thousands of processors in today’s supercomputers.
A related concept, called grid computing, makes use of the Internet. Grid computing uses many networked computers to attack the same problem. A computer program decomposes the problem in such a way that each computer can work on solving one part of the problem at the same time that other computers solve other parts.
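The idea of splitting one problem into parts that are solved at the same time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the problem (adding up a list of numbers) and the names `split_into_parts` and `decomposed_sum` are hypothetical, and threads on a single machine stand in for the separate networked computers of a real grid.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(items, parts):
    """Divide items into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def decomposed_sum(items, workers=4):
    """Sum a list by giving each worker one chunk, then combining the results."""
    chunks = split_into_parts(items, workers)
    # Each worker thread plays the role of one computer in the grid.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # Combine the partial answers into the answer to the whole problem.
    return sum(partial_sums)


if __name__ == "__main__":
    print(decomposed_sum(list(range(1, 101))))
```

The sketch shows the two halves of the technique: decomposing the problem into independent parts, and combining the partial answers afterward. How much time is actually saved depends on the machines and the problem.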
The kind of decomposition described in Section 4.3, like that of multitasking, involves dividing a program into separate instructions or groups of instructions. Data decomposition is another use of the concept of divide and conquer. In fact, data decomposition is so important that there is an entire field of computing known as data organization.
As an example of decomposing data, we examine two search algorithms. A search algorithm is a method for examining a group of data items in order to find an item with some particular property.
The most intuitive of all search algorithms is linear search. A linear search requires that the group of items be arranged one after another from first to last. The algorithm consists of examining the first item, then the second, the third, the fourth, and so on, until the desired item has been found or every item has been examined.
As an example of a linear search, suppose you were given a stack of 500 award certificates and told that one of them might be yours. A linear search for your certificate would proceed as follows:
Step 1 Check the name on the top certificate. If it is not your name, proceed to Step 2.
Step 2 Check the name on the 2nd certificate. If it is not your name, proceed to Step 3.
Step 3 Check the name on the 3rd certificate. If it is not your name, proceed to Step 4.
…
Step 500 Check the name on the 500th certificate.
If you find your name on a certificate, you can stop at that step. In the worst case (when your certificate is last or not included) you would need to examine all 500 certificates.
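The steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `linear_search`, the use of a list of name strings to represent the stack, and the returned step count are all assumptions made for the example.

```python
def linear_search(certificates, name):
    """Check certificates from the top of the stack down.

    Returns (position, steps): the index of the matching certificate,
    or None if it is not in the stack, plus the number of names checked.
    """
    steps = 0
    for position, current in enumerate(certificates):
        steps += 1  # one step per certificate examined
        if current == name:
            return position, steps  # found: stop at this step
    return None, steps  # every certificate was examined without a match


if __name__ == "__main__":
    # A hypothetical stack of 500 made-up names.
    stack = [f"Name {i:03d}" for i in range(500)]
    print(linear_search(stack, "Name 499"))  # certificate is last
    print(linear_search(stack, "Missing"))   # certificate is not included
```

Both calls in the usage example examine all 500 certificates, matching the worst case described in the text.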
Now suppose that the stack of award certificates is sorted alphabetically with the beginning of the alphabet at the top of the stack. In this case we could use a different algorithm known as binary search. Each step of a binary search examines the middle item of the remaining group. By comparing the middle item to the sought item, half of the group can be eliminated from further consideration in this search. To see why, suppose that your name is Jones, Susan and the name in the middle is Smith, John. Since the stack has been alphabetized and since Jones precedes Smith alphabetically, there is no need to consider the half of the data from Smith to the end. Similarly, if the middle name were Gomez, Marc, then the first half of the data (through Gomez) could be eliminated from further consideration. A binary search of the previous (alphabetized) stack of 500 award certificates would proceed as follows:
Step 1 Check the name on the middle certificate in the stack. If the middle certificate alphabetically precedes your name, then set aside the top half of the stack. If the middle certificate alphabetically follows your name, then set aside the bottom half of the stack.
Step 2 Check the name on the middle certificate in the remaining stack. If the middle certificate alphabetically precedes your name, then set aside the top half of the remaining stack. If the middle certificate alphabetically follows your name, then set aside the bottom half of the remaining stack.
Repeat Step 2 until either your certificate is found or the remaining stack has only one certificate that is not yours.
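The binary search procedure can be sketched in Python as follows. The function name `binary_search` and the list-of-strings representation are assumptions for illustration. One small variation from the steps above: this sketch also sets aside the middle certificate itself once it has been checked, so the loop ends when nothing remains rather than when one non-matching certificate remains.

```python
def binary_search(sorted_names, name):
    """Search an alphabetized list by repeatedly checking the middle item.

    Returns (position, steps): the index of the match, or None if the
    name is absent, plus the number of names checked.
    """
    low, high = 0, len(sorted_names) - 1  # bounds of the remaining stack
    steps = 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sorted_names[middle] == name:
            return middle, steps
        if sorted_names[middle] < name:
            # Middle precedes the sought name: set aside the top half.
            low = middle + 1
        else:
            # Middle follows the sought name: set aside the bottom half.
            high = middle - 1
    return None, steps


if __name__ == "__main__":
    names = sorted(["Smith, John", "Gomez, Marc", "Jones, Susan"])
    print(binary_search(names, "Jones, Susan"))
```

Python compares strings character by character, which matches alphabetical order here only because the names use consistent capitalization.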
As mentioned earlier, a disadvantage of binary search is that it requires data that is sorted. However, the payoff is that binary search is usually faster. A linear search of 500 award certificates can require as many as 500 steps, while a binary search requires at most 9. Figure 4.13 justifies this claim.
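The figure of 9 steps can also be checked with a short calculation. The sketch below is not taken from Figure 4.13; it assumes only that checking the middle of a stack of n certificates leaves at most n // 2 certificates to search. The function name `worst_case_stack_sizes` is hypothetical.

```python
def worst_case_stack_sizes(n):
    """List the largest possible stack size at each step of a binary search."""
    sizes = []
    while n > 0:
        sizes.append(n)  # one certificate in a stack of this size is checked
        n //= 2          # at most half (rounded down) of the stack remains
    return sizes


if __name__ == "__main__":
    sizes = worst_case_stack_sizes(500)
    print(sizes)       # [500, 250, 125, 62, 31, 15, 7, 3, 1]
    print(len(sizes))  # 9
```

Each entry corresponds to one step, so a stack of 500 is exhausted after at most 9 checks, whereas a linear search may need all 500.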
Both linear search and binary search involve decomposition of data in the sense that they divide the data into individual parts to be searched, but the way that this division occurs is quite different. Linear search removes one item from the remaining group of items at each step of the algorithm, whereas binary search removes roughly half of the group at each step. This distinction in the way that divide and conquer is applied leads to the significantly improved performance of binary search. Chapter 7 discusses further how decomposition is useful for data and considers more data organization techniques.