Debugging by Deduction

The process of deduction proceeds from some general theories or premises, using the processes of elimination and refinement, to arrive at a conclusion (the location of the error), as shown in Figure 8.4.

FIGURE 8.3 An Example of Clue Structuring.

Consider a murder case, for example. With induction, you induce a suspect from the clues. With deduction, you start with a set of suspects and, by the process of elimination (the gardener has a valid alibi) and refinement (it must be someone with red hair), decide that the butler must have done it. The steps are as follows:

1. Enumerate the possible causes or hypotheses. The first step is to develop a list of all conceivable causes of the error. They don’t have to be complete explanations; they are merely theories to help you structure and analyze the available data.

2. Use the data to eliminate possible causes. Carefully examine all of the data, particularly by looking for contradictions (you could use Figure 8.2 here), and try to eliminate all but one of the possible causes. If all are eliminated, you need more data gained from additional test cases to devise new theories. If more than one possible cause remains, select the most probable cause—the prime hypothesis—first.

3. Refine the remaining hypothesis. The possible cause at this point might be correct, but it is unlikely to be specific enough to pinpoint the error. Hence, the next step is to use the available clues to refine the theory. For example, you might start with the idea that “there is an error in handling the last transaction in the file” and refine it to “the last transaction in the buffer is overlaid with the end-of-file indicator.”

4. Prove the remaining hypothesis. This vital step is identical to step 4 in the induction method.

5. Fix the error. Again, this step is identical to step 5 in the induction method. To re-emphasize, though, you should thoroughly test your fix to ensure it does not create problems elsewhere in the application.
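The elimination in step 2, and the three outcomes it can have, can be sketched in a few lines of Python. This is only an illustration built on the murder-case analogy: the suspects, their attributes, and the names `eliminate` and `next_action` are hypothetical and do not come from any real tool.

```python
from typing import Callable, Dict, List

# A clue is a test applied to one candidate's known facts; it returns True
# when the candidate is still consistent with that clue.
Clue = Callable[[dict], bool]


def eliminate(candidates: Dict[str, dict], clues: List[Clue]) -> List[str]:
    """Step 2: keep only the candidates that no clue contradicts."""
    return [
        name
        for name, facts in candidates.items()
        if all(clue(facts) for clue in clues)
    ]


def next_action(remaining: List[str]) -> str:
    """Decide what to do with whatever survives elimination."""
    if not remaining:
        return "gather more data and devise new hypotheses"
    if len(remaining) > 1:
        return "pursue the most probable of: " + ", ".join(remaining)
    return "refine and prove: " + remaining[0]


# Hypothetical data for the analogy in the text.
suspects = {
    "gardener": {"has_alibi": True, "hair": "red"},
    "butler": {"has_alibi": False, "hair": "red"},
    "cook": {"has_alibi": False, "hair": "brown"},
}
clues = [
    lambda facts: not facts["has_alibi"],   # elimination: a valid alibi rules a suspect out
    lambda facts: facts["hair"] == "red",   # refinement: it must be someone with red hair
]

remaining = eliminate(suspects, clues)
print(remaining)               # ['butler']
print(next_action(remaining))  # refine and prove: butler
```

In debugging, the candidates would be the enumerated causes and the clues would be checks against the observed test results.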

FIGURE 8.4 The Deductive Debugging Process.

As an example, assume that we are commencing the function testing of the DISPLAY command discussed in Chapter 4. Of the 38 test cases identified by the process of cause-effect graphing, we start by running four test cases. As part of the process of establishing input conditions, we will initialize memory so that the first, fifth, ninth, . . . , words have the value 0000; the second, sixth, . . . , words have the value 4444; the third, seventh, . . . , words have the value 8888; and the fourth, eighth, . . . , words have the value CCCC. That is, each memory word is initialized to the low-order hexadecimal digit in the address of the first byte of the word (the values of locations 23FC, 23FD, 23FE, and 23FF are C).
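The initialization rule can be checked with a small sketch. It assumes 4-byte words, which is inferred from the specification's "four words, or 16 bytes per line"; the function name `initial_word` is hypothetical.

```python
WORD_SIZE = 4  # assumption: 16 bytes per line / 4 words per line


def initial_word(address: int) -> str:
    """Return the initialized value of the word containing `address`.

    The word is filled with the low-order hexadecimal digit of the
    address of its first byte, as described in the text.
    """
    word_start = address - (address % WORD_SIZE)
    digit = format(word_start & 0xF, "X")
    return digit * 4


# The first four words cycle through 0000, 4444, 8888, CCCC.
print([initial_word(a) for a in (0x0, 0x4, 0x8, 0xC)])
# ['0000', '4444', '8888', 'CCCC']

# Locations 23FC through 23FF all belong to the word starting at 23FC.
print({initial_word(a) for a in range(0x23FC, 0x2400)})
# {'CCCC'}
```

Because the pattern repeats every 16 bytes, any displayed line that shows these values, rather than garbage, must come from initialized memory.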

The test cases, their expected output, and the actual output after the test are shown in Figure 8.5.

Obviously, we have some problems, since apparently none of the test cases produced the expected results (all were successful, in the sense that each one exposed an error). But let’s start by debugging the error associated with the first test case. The command indicates that, starting at location 0 (the default), E locations (14 in decimal) are to be displayed. (Recall that the specification stated that all output will contain four words, or 16 bytes per line.)

Enumerating the possible causes for the unexpected error message, we might get:

1. The program does not accept the word DISPLAY.

2. The program does not accept the period.

3. The program does not allow a default as a first operand; it expects a storage address to precede the period.

4. The program does not allow an E as a valid byte count.

FIGURE 8.5 Test Case Results from the DISPLAY Command.

The next step is to try to eliminate the causes. If all are eliminated, we must retreat and expand the list. If more than one remain, we might want to examine additional test cases to arrive at a single error hypothesis, or proceed with the most probable cause. Since we have other test cases at hand, we see that the second test case in Figure 8.5 seems to eliminate the first hypothesis; and the third test case, although it produced an incorrect result, seems to eliminate the second and third hypotheses.

The next step is to refine the fourth hypothesis. It seems specific enough, but intuition might tell us that there is more to it than meets the eye: it sounds like an instance of a more general error. We might contend, then, that the program does not recognize the special hexadecimal characters A–F. The absence of such characters in the other test cases makes this sound like a viable explanation.

Rather than jumping to a conclusion, however, we should first consider all of the available information. The fourth test case might represent a totally different error, or it might provide a clue about the current error. Given that the highest valid address in our system is 7FFF, how could the fourth test case display an area that appears to be nonexistent? The fact that the displayed values are our initialized values, and not garbage, might lead to the supposition that this command is somehow displaying something in the range 0–7FFF. One idea that may arise is that this could occur if the program were treating the operands in the command as decimal values rather than hexadecimal, as stated in the specification. This is borne out by the third test case: Rather than displaying 32 bytes of memory, the next increment above 11 in hexadecimal (17 in base 10), it displays 16 bytes of memory, which is consistent with our hypothesis that the 11 is being treated as a base-10 value. Hence, the refined hypothesis is that the program is treating the byte count and storage address operands, and the storage addresses on the output listing, as decimal values.
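The arithmetic behind the third test case can be sketched as follows. The sketch assumes that output is always rounded up to whole 16-byte lines and ignores the alignment of the starting address; the function name `bytes_displayed` is hypothetical.

```python
BYTES_PER_LINE = 16  # from the specification: four words, or 16 bytes, per line


def bytes_displayed(count_text: str, base: int) -> int:
    """Bytes shown when a byte-count operand is read in the given base.

    Assumption: the display is rounded up to a whole number of lines.
    """
    requested = int(count_text, base)
    lines = -(-requested // BYTES_PER_LINE)  # ceiling division
    return lines * BYTES_PER_LINE


print(bytes_displayed("11", 16))  # 32: 11 hex is 17 bytes, which needs two lines
print(bytes_displayed("11", 10))  # 16: 11 decimal fits on one line
```

Only the base-10 reading yields the 16 bytes actually observed.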

The last step is to prove this hypothesis. Looking at the fourth test case, if 8000 is interpreted as a decimal number, the corresponding base-16 value is 1F40, which would lead to the output shown. As further proof, examine the second test case. The output is incorrect, but if 21 and 29 are treated as decimal numbers, the locations of storage addresses 15–1D would be displayed; this is consistent with the erroneous result of the test case. Hence, we have almost certainly located the error: The program is assuming that the operands are decimal values and is printing the memory addresses as decimal values, which is inconsistent with the specification.
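The conversions used in this proof are easy to confirm. The helper name `decimal_operand_as_hex` is hypothetical; it simply reads an operand as base 10 and reports the address that value corresponds to in base 16.

```python
def decimal_operand_as_hex(operand_text: str) -> str:
    """Read an operand as decimal and show the resulting address in hex."""
    return format(int(operand_text, 10), "X")


print(decimal_operand_as_hex("8000"))  # 1F40
print(decimal_operand_as_hex("21"))    # 15
print(decimal_operand_as_hex("29"))    # 1D
```

Note that 1F40 lies within the valid range 0–7FFF, which explains why the fourth test case showed initialized values instead of reporting a nonexistent address.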


Moreover, this error seems to be the cause of the erroneous results of all four test cases. A little thought has led to the error, and it also solved three other problems that, at first glance, appear to be unrelated.

Note that the error probably manifests itself at two locations in the program: the part that interprets the input command and the part that prints memory addresses on the output listing.
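The two locations might look something like the sketch below. The program's real code is not shown in the text, so every function name here is hypothetical; the point is only to contrast the specified base-16 behavior with the suspected base-10 behavior.

```python
def parse_operand_as_specified(text: str) -> int:
    """Input side, per the specification: operands are hexadecimal."""
    return int(text, 16)


def parse_operand_suspected(text: str) -> int:
    """Input side, as the program appears to behave: operands read as decimal."""
    return int(text, 10)


def format_address_as_specified(address: int) -> str:
    """Output side, per the specification: addresses printed in hexadecimal."""
    return format(address, "X")


def format_address_suspected(address: int) -> str:
    """Output side, as the program appears to behave: addresses printed in decimal."""
    return str(address)


def accepts(parser, text: str) -> bool:
    """True if the parser can read the operand, False if it rejects it."""
    try:
        parser(text)
        return True
    except ValueError:
        return False


# A decimal parser cannot read the digit E at all, which would fit the
# unexpected error message in the first test case.
print(accepts(parse_operand_as_specified, "E"))  # True
print(accepts(parse_operand_suspected, "E"))     # False

# For the operand 21, the suspected parser selects location 21 decimal,
# which is address 15 in hexadecimal but would be labeled 21 on the listing.
location = parse_operand_suspected("21")
print(format_address_as_specified(location))  # 15
print(format_address_suspected(location))     # 21
```

Under this reading, a fix would need to touch both the parsing and the printing code, and each change would need its own regression tests.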

As an aside, this error, likely caused by a misunderstanding of the specification, reinforces the suggestion that a programmer should not attempt to test his or her own program. If the programmer who created this error is also designing the test cases, he or she likely will make the same mistake while writing the test cases. In other words, the programmer’s expected outputs would not be those of Figure 8.5; they would be the outputs calculated under the assumption that the operands are decimal values. Therefore, this fundamental error probably would go unnoticed.
