Summary - Evaluating data structures for runtime storage of aspect instances

The unified model helps us to answer the first two of our research questions.

Question #1: Which operations are required to implement instantiation policies in general?

We identified four basic properties that are used to implement an instantiation policy:

1. The bind function, which defines how the query key tuple is built from context values.

2. The implicit flag, which determines whether an instantiation policy supports implicit instantiation.

3. The store function, which registers associations between key tuples and aspect instances.

4. The find function, which is used to find aspect instances for a given key tuple. We assume that aspects stay deployed once they have been deployed. Otherwise, an additional function to remove existing associations would be required.
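The four properties can be pictured as one small record. The following is a minimal sketch only: the class name `InstantiationPolicy`, the helper `lookup`, the `factory` parameter and the use of a plain dictionary as storage are assumptions made for illustration and are not taken from the model itself. In particular, treating implicit instantiation as "create the instance on a failed find" is an assumed reading.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Tuple

Key = Tuple[Any, ...]  # a key tuple built from context values


@dataclass
class InstantiationPolicy:
    """Hypothetical container for the four properties of a policy."""

    # bind: builds the query key tuple from context values
    bind: Callable[[Dict[str, Any]], Key]
    # implicit: whether the policy supports implicit instantiation
    implicit: bool
    # storage used by store/find (assumption: a plain dictionary)
    _instances: Dict[Key, Any] = field(default_factory=dict)

    def store(self, key: Key, instance: Any) -> None:
        """Register an association between a key tuple and an instance."""
        self._instances[key] = instance

    def find(self, key: Key) -> Optional[Any]:
        """Return the instance for the key tuple, or None if there is none."""
        return self._instances.get(key)

    def lookup(self, context: Dict[str, Any], factory: Callable[[], Any]) -> Optional[Any]:
        """Assumed glue code: bind, find, and instantiate implicitly if allowed."""
        key = self.bind(context)
        instance = self.find(key)
        if instance is None and self.implicit:
            instance = factory()
            self.store(key, instance)
        return instance


# Usage: one instance per value of the (hypothetical) context entry "target".
per_target = InstantiationPolicy(bind=lambda ctx: (ctx["target"],), implicit=True)
first = per_target.lookup({"target": "t1"}, object)
second = per_target.lookup({"target": "t1"}, object)
print(first is second)  # True
```

Only `bind` differs between policies in this sketch; `store` and `find` could be swapped for another data structure without changing what the policy means.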

Question #2: Which operations can be optimised without affecting the semantics of instantiation policies?

The store and find functions are independent of the semantics of an instantiation policy. Therefore, we are interested in optimising these two functions. The bind function, on the other hand, defines the semantics of an instantiation policy. We expect this function to vary for each instantiation policy.

5 Criteria

To compare data structures and the operations that are executed on them, criteria need to be defined. As our goal is to make the look-up of aspect instances faster, our main criterion is the time required by the operations defined in Section 4, specifically the find and store functions. We differentiate between the theoretical asymptotic time complexity of those operations and the clock time used by actual implementations of the operations and data structures. The amount of memory that the data structures occupy in relation to the number of aspect instances is also important, because in certain usage scenarios (such as mobile devices) memory is as limited a resource as time.
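As a rough illustration of the two practical criteria, clock time and memory can both be measured with the Python standard library. This is a sketch under assumptions: the helper `measure`, the two `build_*` functions and the use of integers as stand-ins for key tuples are hypothetical, and the numbers it prints depend on the machine and interpreter.

```python
import time
import tracemalloc


def build_list(n):
    """Hypothetical store: keys kept in an unsorted list."""
    return list(range(n))


def build_dict(n):
    """Hypothetical store: keys kept in a hash table."""
    return dict.fromkeys(range(n))


def measure(build, n):
    """Return (seconds for one look-up, peak bytes while building, found)."""
    tracemalloc.start()
    structure = build(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    start = time.perf_counter()
    found = (n - 1) in structure  # look up the last key that was stored
    elapsed = time.perf_counter() - start
    return elapsed, peak, found


for build in (build_list, build_dict):
    elapsed, peak, found = measure(build, 100_000)
    print(f"{build.__name__}: {elapsed:.6f} s, {peak} bytes, found={found}")
```

A single timed look-up is noisy; a real comparison would repeat the measurement many times and report an aggregate.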

5.1 Asymptotic time complexity

One way to describe and compare the theoretical amount of time that different algorithms require for a specific task is the (asymptotic) computational complexity [21]. It describes the time and space requirements of algorithms and computational problems with respect to the size of the input data. Because the actual computational complexity of an algorithm often depends on the input data, the actual time and space requirements depend on the individual case. As not all cases can be considered in practice, typically only the best, worst and average cases are investigated.

Example 16. Given an unsorted list of numbers of length $N$, one wants to find out if a specific number is present in this list. Because the list is unsorted, one needs to compare the elements in the list one by one until the desired element is found or the end of the list is reached. Let $L = [4, 2, 7, 5]$ be such a list, where $N = 4$. Let us further assume that we always begin at the first element of the list when searching for a specific element. If we want to find out if $2$ is an element of this list, we start by comparing the first element ($4$) to $2$. They are not equal, so we proceed to the next element ($2$) and compare it to $2$. The element is equal to the sought element. It took 2 comparisons to find the element. If we try to find out if $4$ is an element of the list, we need to compare the first element only, as this is already the sought element. We call this the best case, that is, the quickest way to complete the algorithm: at least one element in the list has to be compared. On the other hand, if we try to find out if $8$ is an element of the list, we again need to compare the elements one by one to the sought value. In this case, however, four comparisons are required, one for each element in the list, because $8$ is not an element of this list. This is the worst case: it takes at most $N$ comparisons to find an element in an unsorted list with $N$ elements.
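The search described in Example 16 can be sketched as follows. The function name `linear_search` and the choice to return the comparison count next to the result are assumptions made for illustration.

```python
from typing import List, Tuple


def linear_search(items: List[int], sought: int) -> Tuple[bool, int]:
    """Search front to back; return (found, number of comparisons made)."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == sought:
            return True, comparisons
    return False, comparisons


L = [4, 2, 7, 5]
print(linear_search(L, 4))  # (True, 1)   best case
print(linear_search(L, 2))  # (True, 2)
print(linear_search(L, 8))  # (False, 4)  worst case: N comparisons
```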

In addition to the best and worst case, the average case can also be considered, that is, how many comparisons it takes on average to find an element in the list. Typically the average case is harder to determine because it is difficult to define the average input. It often depends on probabilistic properties. For example, if it is equally likely that the sought element is present at position 1, 2, 3, 4 or not part of the list, the average number of comparisons is $(1 + 2 + 3 + 4 + 4)/5 = 2.8$.
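The average of 2.8 comparisons can be reproduced by enumerating the five equally likely cases. This is a minimal sketch; the helper `comparisons_needed` and the use of the value 8 as the "not in the list" case are assumptions taken over from the example list.

```python
from fractions import Fraction


def comparisons_needed(items, sought):
    """Number of comparisons a front-to-back search makes for `sought`."""
    for count, item in enumerate(items, start=1):
        if item == sought:
            return count
    return len(items)  # not found: every element was compared


L = [4, 2, 7, 5]
cases = L + [8]  # positions 1 to 4 plus one miss, all assumed equally likely
average = Fraction(sum(comparisons_needed(L, x) for x in cases), len(cases))
print(float(average))  # 2.8
```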

The Big-O notation [22] is usually used to give an upper limit of the computational complexity, and it is the measure typically used when quantifying computational complexity. If the computational complexity of an algorithm is given by $f(n)$ (where $n$ is a quantification of the data on which the algorithm works), then $f(n) \in O(g(n))$ means that the computational complexity of this algorithm does not grow faster than $M\,g(n)$ (where $M$ is a constant). That is, $|f(x)| \le M\,|g(x)|$ for $x > x_0$: from a certain $x_0$ on, $f(x)$ is less than or equal to the value of $g(x)$ multiplied by a constant.

Example 17. If $f(N)$ gives the number of comparisons required to find a specific element in a list of length $N$, it will never be larger than $N$, no matter how long the list is: at most $N$ comparisons are necessary. We can therefore say that $f(N) \in O(N)$. We can also define $f(N)$ to give the number of operations executed by a computer program that implements the algorithm to find the element. Such a program would basically loop over all elements in the list and compare each element with the sought value until it is found or the whole list has been searched. The actual number of operations executed by such a program would be no more than $KN + C$, where $N$ is the length of the list, $K$ is the number of operations required to compare a single element and $C$ is some constant overhead to set up loop variables etc. Still, $f(N) \in O(N)$, assuming $N > 0$, as this says that

$$KN + C \le MN \iff K + \frac{C}{N} \le M.$$

For any combination of $K$ and $C$, we can find an $M$ such that $M \ge K + C/N$. For example, assume that $K = 8$ and $C = 4$. Then $M = 12$ satisfies this inequality for any $N > 0$. For $N = 1$, we have $12 \ge 8 + 4/1$, and the fraction $4/N$ becomes smaller the larger $N$ gets. Therefore the inequality still holds for $N > 1$.

We can therefore say that the time complexity of searching a list linearly has an upper limit of $O(N)$: if the list is twice as long, searching it takes, in theory, twice as long in the worst case.
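The choice of $M = 12$ for $K = 8$ and $C = 4$ can be checked numerically. The sketch below only tests a finite range of list lengths, so it illustrates the inequality rather than proving it; the function names are assumptions.

```python
K, C, M = 8, 4, 12  # the values used in Example 17


def operations(n: int) -> int:
    """Upper limit on the operations of the search program: K*N + C."""
    return K * n + C


def bound(n: int) -> int:
    """The claimed linear bound: M*N."""
    return M * n


def bound_holds(n_max: int) -> bool:
    """Check K*N + C <= M*N for every N from 1 to n_max."""
    return all(operations(n) <= bound(n) for n in range(1, n_max + 1))


print(operations(1), bound(1))  # 12 12  (equality at N = 1)
print(bound_holds(10_000))      # True
```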

A selection of typical complexity classes is shown in Table 2. The complexity classes help us compare algorithms. A sorting algorithm that has a worst-case complexity of $O(N \log N)$ is expected to work faster for large input datasets than a sorting algorithm of class $O(N^2)$. This even holds for different implementations: for sufficiently large input data, even a highly optimized $O(N^2)$ algorithm will take longer than a badly optimized $O(N \log N)$ algorithm.
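The effect of constant factors can be illustrated with two cost models. The factors 100 and 1 are invented for this sketch and stand for a badly optimized and a highly optimized implementation; they are not measurements.

```python
import math

SLOW_FACTOR = 100  # assumed constant factor of a badly optimized O(N log N) algorithm
FAST_FACTOR = 1    # assumed constant factor of a highly optimized O(N^2) algorithm


def cost_n_log_n(n: int) -> float:
    """Modelled cost of the O(N log N) algorithm."""
    return SLOW_FACTOR * n * math.log2(n)


def cost_quadratic(n: int) -> float:
    """Modelled cost of the O(N^2) algorithm."""
    return FAST_FACTOR * n * n


for n in (10, 100, 1_000, 10_000, 100_000):
    cheaper = "N log N" if cost_n_log_n(n) < cost_quadratic(n) else "N^2"
    print(f"N={n:>7}: cheaper model is {cheaper}")
```

Under these assumed factors the quadratic model is cheaper for the two smallest sizes and the $N \log N$ model is cheaper for the larger ones, which is the behaviour the paragraph describes.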

Table 2: Examples of complexity classes

| Complexity class | Description |
| --- | --- |
| $O(1)$ | The complexity does not depend on the size of the input dataset. |
| $O(N)$ | The complexity grows linearly with the size of the input dataset. |
| $O(N^2)$ | The complexity grows quadratically with the size of the input dataset. |
| $O(\log N)$ | The complexity grows logarithmically with the size of the input dataset. |

The best, average and worst cases can all be classified using the Big-O notation. That is, the Big-O notation does not automatically refer to the worst-case complexity of an algorithm, although the worst-case complexity is typically used to classify algorithms. For example, the Bubblesort algorithm [25, p. 106] has a worst-case complexity of $O(n^2)$ (if the list is sorted in reverse order), but a best-case complexity of $O(n)$ (if the list is already sorted). Still, Bubblesort is typically referred to as an $O(n^2)$ algorithm.
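The gap between the best and worst case of Bubblesort can be made visible by counting comparisons. The sketch below assumes the common variant that stops as soon as a pass makes no swap, which is what gives the linear best case; whether [25] presents exactly this variant is not checked here.

```python
def bubble_sort(items):
    """Return (sorted copy, number of comparisons), stopping early when sorted."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:  # a pass without swaps means the list is sorted
            break
    return data, comparisons


n = 8
print(bubble_sort(range(1, n + 1))[1])  # 7   best case: n - 1 comparisons
print(bubble_sort(range(n, 0, -1))[1])  # 28  worst case: n * (n - 1) / 2 comparisons
```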
