
To illustrate the overall strategy of merging states at points where information loss is minimised because variables have fallen out of scope, two small examples are presented here. Throughout this section, a branch of the analysis will be referred to by the notation {pc, variables}, where pc denotes the current program counter and variables denotes a fixed-domain function from variable names to the range of values each variable can take, together with the properties inferred about that variable's relations to other variables.

1 def gt5(x):
2     if x > 5:
3         result = 1
4     else:
5         result = 0
6     del x
7     return result
8
9 analyse(gt5(Range(0, 10)))

Figure 5.2: A Trivial Example
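The {pc, variables} notation can be pictured as a small data structure. The following is a minimal sketch only: the names Interval and State are hypothetical and not taken from the analyser, and the inferred relations between variables are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Interval:
    """A range of values whose end points are independently open or closed."""
    lo: float
    hi: float
    lo_closed: bool = False
    hi_closed: bool = False

    def __str__(self) -> str:
        left = "[" if self.lo_closed else "("
        right = "]" if self.hi_closed else ")"
        return f"{left}{self.lo}, {self.hi}{right}"


@dataclass
class State:
    """One branch of the analysis: a program counter plus variable ranges."""
    pc: int
    variables: Dict[str, Interval]

    def __str__(self) -> str:
        body = ", ".join(f"{name} = {rng}" for name, rng in self.variables.items())
        return f"{{{self.pc}, ({body})}}"


# Hypothetical starting state for an analysis over the open range (0, 10).
start = State(pc=1, variables={"x": Interval(0, 10)})
print(start)  # {1, (x = (0, 10))}
```

Keeping the open or closed status of each end point explicit matters in the examples that follow, where a comparison divides a range exactly at its boundary value.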

5.2.1 Trivial Example

Figure 5.2 details a trivial example. The function gt5 simply checks whether a value is greater than 5. The example requests an analysis of the gt5 function with the input range 0 to 10. Hence, the analysis starts at Line 1 with the state {1, (x = (0, 10))}. As soon as the conditional on Line 2 is evaluated, the analysis must consider both possibilities. This splits the analysis so that there are two paths being considered, {2, (x = (5, 10))} and {2, (x = (0, 5])}. As both of these contain the same number of variables and have evaluated the same number of instructions, the order in which they are evaluated is irrelevant.
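The split at the conditional can be sketched as a function that divides a range around the comparison constant. This is an illustrative sketch under assumptions: Interval and split_greater_than are hypothetical names, and only the comparison form "value > k" is handled.

```python
from typing import NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    lo: float
    hi: float
    lo_closed: bool = False
    hi_closed: bool = False


def split_greater_than(
    rng: Interval, k: float
) -> Tuple[Optional[Interval], Optional[Interval]]:
    """Split rng into (values > k, values <= k); None marks an empty side."""
    if k < rng.lo or (k == rng.lo and not rng.lo_closed):
        return rng, None  # every value in rng is greater than k
    if k >= rng.hi:
        return None, rng  # no value in rng is greater than k
    true_part = Interval(k, rng.hi, False, rng.hi_closed)
    false_part = Interval(rng.lo, k, rng.lo_closed, True)
    return true_part, false_part


# x in (0, 10) tested against x > 5, as on Line 2 of Figure 5.2.
taken, not_taken = split_greater_than(Interval(0, 10), 5)
print(taken)      # Interval(lo=5, hi=10, lo_closed=False, hi_closed=False)
print(not_taken)  # Interval(lo=0, hi=5, lo_closed=False, hi_closed=True)
```

The false branch keeps the value 5 itself, which is why its upper end point is closed.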

Hence the analysis picks {2, (x = (5, 10))} and continues evaluation. The variable result is created and assigned the value 1, before Line 6 is reached and the variable x is deleted. This causes the analyser to re-evaluate which path should be analysed. The states are now {2, (x = (0, 5])} and {7, (result = 1)}. Whilst both contain the same number of variables, this time {2, (x = (0, 5])} has the lower program counter.

Looking for a chance to re-merge the forked states, the analysis switches to the other state and proceeds from {2, (x = (0, 5])}. In this state, result is created with the value 0, before the analysis again arrives at Line 6 and deletes x, again causing the analyser to re-evaluate the paths. In this case, however, it sees that both states have the same program counter, and are therefore suitable for merging. Hence, the analysis merges both states to get {7, (result = [0, 1])}, and then continues. The function gt5 terminates immediately afterwards, with the analyser having explored all possible paths through the program and found that at most 5 lines of the program will execute.

 1 def f(y):
 2     while y < 10:
 3         y *= y
 4     return y
 5
 6 def g(y):
 7     z = 0
 8     while z < y:
 9         z += 50
10     return z
11
12 def m(x):
13     x = f(x)
14     x = g(x)
15     return x
16
17 analyse(m(Range(2, 15)))

Figure 5.3: A Contrived, More Complicated Example
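The merge that closes the gt5 walkthrough can be sketched as follows. This is a simplification under stated assumptions: ranges are modelled as closed integer pairs, the name merge is hypothetical, and two states are treated as mergeable only when they share a program counter and the same set of variable names.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]            # closed interval (low, high), a simplification
State = Tuple[int, Dict[str, Range]]


def merge(a: State, b: State) -> State:
    """Join two states at the same program counter by widening each range."""
    pc_a, vars_a = a
    pc_b, vars_b = b
    if pc_a != pc_b or vars_a.keys() != vars_b.keys():
        raise ValueError("states are not suitable for merging")
    merged = {
        name: (min(vars_a[name][0], vars_b[name][0]),
               max(vars_a[name][1], vars_b[name][1]))
        for name in vars_a
    }
    return pc_a, merged


# The two gt5 states once x has been deleted on Line 6.
print(merge((7, {"result": (1, 1)}), (7, {"result": (0, 0)})))
# (7, {'result': (0, 1)})
```

Because x has already been deleted in both states, the merge loses nothing beyond the link between the branch taken and the value of result.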

5.2.2 While Loop Example

Figure 5.3 illustrates a more contrived example with non-trivial program flow. The analyser is requested to analyse the function m with the range of values (2, 15). Hence the analysis starts at Line 12. The first instruction calls function f. Importantly, the call initialises a new variable, y, and hence the state on entering f is {1, (x = (2, 15), y = (2, 15))}.

Upon encountering the loop on Line 2, the analysis must consider two paths, and hence splits into {2, (x = (2, 15), y = [10, 15))} and {2, (x = (2, 15), y = (2, 10))}, where x is as before and y is the new variable defined inside the scope of the function f. As these contain the same number of variables, the order in which they are analysed is irrelevant. Picking {2, (x = (2, 15), y = [10, 15))} first, the analysis simply skips the loop and returns the value. On return, the variable y drops out of scope, and hence after reaching the state {14, x = [10, 15)} the analysis returns to evaluate the state {2, (x = (2, 15), y = (2, 10))}, as it has more variables.
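The choice of which pending state to evaluate next can be sketched as a simple ordering. The rule used here is an assumption inferred from the two walkthroughs (prefer the state with more variables, then the lower program counter); pick_next is a hypothetical name, and ranges are kept as plain text because only the variable count and program counter matter for the choice.

```python
from typing import Dict, List, Tuple

State = Tuple[int, Dict[str, str]]  # (pc, variables); ranges shown as text


def pick_next(pending: List[State]) -> State:
    """Prefer more variables, then the lower program counter (assumed rule)."""
    return min(pending, key=lambda s: (-len(s[1]), s[0]))


# After the first pass through f: one state has returned, one is still in f.
in_f = [
    (14, {"x": "[10, 15)"}),
    (2, {"x": "(2, 15)", "y": "(2, 10)"}),
]
print(pick_next(in_f)[0])  # 2

# After the first branch of gt5 has deleted x: equal variable counts.
in_gt5 = [
    (7, {"result": "1"}),
    (2, {"x": "(0, 5]"}),
]
print(pick_next(in_gt5)[0])  # 2
```

Evaluating the state that still holds more variables gives it the chance to reach the point where those variables fall out of scope, at which point it may share a program counter with a waiting state.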

After executing the body of the while loop, the analysis again evaluates the while loop condition with the state {2, (x = (2, 15), y = (4, 100))}. This again results in a split into two states, with {2, (x = (2, 15), y = [10, 100))} and {2, (x = (2, 15), y = (4, 10))} being considered. As in the first case, the false branch {2, (x = (2, 15), y = [10, 100))} is picked first and simply evaluated until y falls out of scope, at {14, x = [10, 100)}. As two states now exist with the same program counter, they are merged, resulting in {14, x = [10, 100)}, and the analysis goes back to consider the remaining state.

With the state {2, (x = (2, 15), y = (4, 10))}, the body of the while loop must be executed a second time, resulting in {2, (x = (2, 15), y = (16, 100))}. In this case, the condition of the while loop, y < 10, must fail, and hence splitting the evaluation for a third time is unnecessary. This state is therefore evaluated until y falls out of scope, resulting in {14, x = (16, 100)}. As this state now has the same program counter as the previous state, they are merged, and once again the result of this is {14, x = [10, 100)}.
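The treatment of the loop in f can be reproduced with a short interval computation. This is a sketch under assumptions rather than the analyser itself: the names Interval, split_less_than, square, hull and analyse_f are hypothetical, squaring is only handled for non-negative ranges, and a cap on the number of condition checks stands in for whatever termination strategy the real analysis uses.

```python
from typing import NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    lo: float
    hi: float
    lo_closed: bool
    hi_closed: bool


def split_less_than(
    r: Interval, k: float
) -> Tuple[Optional[Interval], Optional[Interval]]:
    """Return (part of r below k, part of r at or above k)."""
    if r.hi < k or (r.hi == k and not r.hi_closed):
        return r, None
    if r.lo >= k:
        return None, r
    return Interval(r.lo, k, r.lo_closed, False), Interval(k, r.hi, True, r.hi_closed)


def square(r: Interval) -> Interval:
    """Abstract version of y *= y, valid only for non-negative ranges."""
    assert r.lo >= 0
    return Interval(r.lo * r.lo, r.hi * r.hi, r.lo_closed, r.hi_closed)


def hull(a: Optional[Interval], b: Interval) -> Interval:
    """Smallest interval covering both a and b (the merge of two ranges)."""
    if a is None:
        return b
    lo, lo_open = min((a.lo, not a.lo_closed), (b.lo, not b.lo_closed))
    hi, hi_closed = max((a.hi, a.hi_closed), (b.hi, b.hi_closed))
    return Interval(lo, hi, not lo_open, hi_closed)


def analyse_f(y: Interval, max_checks: int = 20) -> Tuple[Optional[Interval], int]:
    """Return the merged range of f's result and the number of condition checks."""
    exits: Optional[Interval] = None
    pending: Optional[Interval] = y
    checks = 0
    while pending is not None and checks < max_checks:
        checks += 1
        inside, outside = split_less_than(pending, 10)
        if outside is not None:
            exits = hull(exits, outside)  # this part leaves the loop
        pending = square(inside) if inside is not None else None
    return exits, checks


result, checks = analyse_f(Interval(2, 15, False, False))
print(result)  # Interval(lo=10, hi=100, lo_closed=True, hi_closed=False)
print(checks)  # 3
```

The three condition checks correspond to the ranges (2, 15), (4, 100) and (16, 100) seen in the walkthrough, and the merged exit range is [10, 100).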

Continuing with the state {14, x = [10, 100)}, function g is called, which creates two new variables in its scope, y and z. Trivially, the first encounter with the while loop on Line 8 will cause the loop's body to be executed. Hence the state at the second encounter is {8, x = [10, 100), y = [10, 100), z = 50}. This results in a split, with {8, x = [10, 100), y = [10, 50), z = 50} and {8, x = [10, 100), y = [50, 100), z = 50} being considered. Picking the first path, this simply continues to evaluate until y and z fall out of scope at the return statement on Line 10, resulting in the state {15, (x = 50)}.

Returning to the previous state {8, x = [10, 100), y = [50, 100), z = 50}, the body of the loop is executed a second time, advancing the state to {8, x = [10, 100), y = [50, 100), z = 100}. This time, due to y being a half-open interval, the condition z < y fails and the loop body is not executed. Hence the evaluation continues until y and z fall out of scope on return, giving the state {15, (x = 100)}. The two states are then merged, giving {15, (x = 50 or x = 100)}, and the evaluation terminates on the return of m.

[Figure 5.4 labels: Value Range; Assumptions (=, <, >, !=, <=, >=) to other values; Exclusions; Usage]

Figure 5.4: The information contained within a single abstract variable

Hence the analysis concludes that the possible return values of m are 50 or 100, that the loop inside function f executes at most 3 times, and the loop inside function g executes at most 2 times.
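The return values found by the analysis can be cross-checked by running the program of Figure 5.3 concretely. The check below assumes integer inputs strictly between 2 and 15, which is only one reading of Range(2, 15); it confirms the set of return values but is not a substitute for the analysis.

```python
def f(y):
    while y < 10:
        y *= y
    return y


def g(y):
    z = 0
    while z < y:
        z += 50
    return z


def m(x):
    x = f(x)
    x = g(x)
    return x


# Assumption: integer inputs 3..14, i.e. strictly inside the range (2, 15).
results = {m(x) for x in range(3, 15)}
print(sorted(results))  # [50, 100]
```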

Having demonstrated the overall approach of the algorithm, a detailed implementation now follows.